Trailing Slashes for Magento CMS Pages - 2 URLs - Duplicate Content
-
Hello,
Can anyone help me find a solution so that Magento CMS pages resolve at only one URL instead of two?
I found a previous article that applies to my issue: using .htaccess to 301 redirect page requests in Magento from the non-slash URL to the trailing-slash URL. I don't fully understand .htaccess syntax, but I used the code below.
It fixed the CMS page redirection but caused issues on other pages - all my categories and products now return this error:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
# Assuming you're running at domain root. Change to working directory if needed.
RewriteBase /

# www check
# If you're running in a subdirectory, then you'll need to add that in
# to the redirected url (http://www.mydomain.com/subdirectory/$1)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]

# Trailing slash check
# Don't fix direct file links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]

# Finally, forward everything to your front-controller (index.php)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L]
301s are not difficult for me, but writing the logic to re-route requests for "URL" to "URL/" is something I don't know how to do. I could manually 301 or rel=canonical my Magento CMS pages every time, but that defeats the purpose of the automation in .htaccess I am trying to get working.
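One approach that may sidestep the loop entirely is to whitelist only the CMS URL keys instead of rewriting everything. A sketch only, assuming an Apache host and placement above Magento's front-controller rewrite - "about-us" and "contact" are placeholder page identifiers, not real ones from this site:

# Add a trailing slash ONLY to the listed CMS page URL keys, so
# category and product URLs are never redirected and cannot loop.
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} ^/(about-us|contact)$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]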
thanks
-
Thank You Kevin.
This is almost the default Magento .htaccess file (out of the box). I think I had added a couple of entries to fix a couple of other issues. The code I just added that isn't working is in the middle of the .htaccess, in the commented section starting with: "## slash removal re-write done by ALEX MEADE for iamgreenminded.com"
############################################
## uncomment these lines for CGI mode
## make sure to specify the correct cgi php binary file name
## it might be /cgi-bin/php-cgi

#    Action php5-cgi /cgi-bin/php5-cgi
#    AddHandler php5-cgi .php

############################################
## GoDaddy specific options

#    Options -MultiViews

## you might also need to add this line to php.ini
##     cgi.fix_pathinfo = 1
## if it still doesn't work, rename php.ini to php5.ini

############################################
## this line is specific for 1and1 hosting

    #AddType x-mapp-php5 .php
    #AddHandler x-mapp-php5 .php

############################################
## default index file

    DirectoryIndex index.php

<IfModule mod_php5.c>

############################################
## adjust memory limit

#    php_value memory_limit 64M
    php_value memory_limit 256M
    php_value max_execution_time 18000

############################################
## disable magic quotes for php request vars

    php_flag magic_quotes_gpc off

############################################
## disable automatic session start
## before autoload was initialized

    php_flag session.auto_start off

############################################
## enable resulting html compression

    #php_flag zlib.output_compression on

###########################################
## disable user agent verification to not break multiple image upload

    php_flag suhosin.session.cryptua off

###########################################
## turn off compatibility with PHP4 when dealing with objects

    php_flag zend.ze1_compatibility_mode Off

</IfModule>

<IfModule mod_security.c>
###########################################
## disable POST processing to not break multiple image upload

    SecFilterEngine Off
    SecFilterScanPOST Off

</IfModule>

<IfModule mod_deflate.c>

############################################
## enable apache served files compression
## http://developer.yahoo.com/performance/rules.html#gzip

    # Insert filter on all content
    ###SetOutputFilter DEFLATE

    # Insert filter on selected content types only
    #AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript

    # Netscape 4.x has some problems...
    #BrowserMatch ^Mozilla/4 gzip-only-text/html

    # Netscape 4.06-4.08 have some more problems
    #BrowserMatch ^Mozilla/4\.0[678] no-gzip

    # MSIE masquerades as Netscape, but it is fine
    #BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

    # Don't compress images
    #SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary

    # Make sure proxies don't deliver the wrong content
    #Header append Vary User-Agent env=!dont-vary

</IfModule>

<IfModule mod_ssl.c>

############################################
## make HTTPS env vars available for CGI mode

    SSLOptions StdEnvVars

</IfModule>

<IfModule mod_rewrite.c>

############################################
## enable rewrites

    Options +FollowSymLinks
    RewriteEngine on

############################################
## slash removal re-write done by ALEX MEADE for iamgreenminded.com

    RewriteBase /
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_URI} !(.*)/$
    RewriteCond %{REQUEST_FILENAME} !\.(gif|jpg|png|jpeg|css|js)$ [NC]
    RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]

############################################
## you can put here your magento root folder
## path relative to web root

    #RewriteBase /magento/

############################################
## uncomment next line to enable light API calls processing

#    RewriteRule ^api/([a-z][0-9a-z_]+)/?$ api.php?type=$1 [QSA,L]

############################################
## rewrite API2 calls to api.php (by now it is REST only)

    RewriteRule ^api/rest api.php?type=rest [QSA,L]

############################################
## workaround for HTTP authorization
## in CGI environment

    RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]

############################################
## TRACE and TRACK HTTP methods disabled to prevent XSS attacks

    RewriteCond %{REQUEST_METHOD} ^TRAC[EK]
    RewriteRule .* - [L,R=405]

############################################
## redirect for mobile user agents

    #RewriteCond %{REQUEST_URI} !^/mobiledirectoryhere/.*$
    #RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
    #RewriteRule ^(.*)$ /mobiledirectoryhere/ [L,R=302]

############################################
## always send 404 on missing files in these folders

    RewriteCond %{REQUEST_URI} !^/(media|skin|js)/

############################################
## never rewrite for existing files, directories and links

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-l

############################################
## rewrite everything else to index.php

    RewriteRule .* index.php [L]

</IfModule>

############################################
## Prevent character encoding issues from server overrides
## If you still have problems, use the second line instead

    AddDefaultCharset Off
    #AddDefaultCharset UTF-8

<IfModule mod_expires.c>

############################################
## Add default Expires header
## http://developer.yahoo.com/performance/rules.html#expires

    ExpiresDefault "access plus 1 year"

</IfModule>

############################################
## By default allow all access

    Order allow,deny
    Allow from all

###########################################
## Deny access to release notes to prevent disclosure of the installed Magento version

<Files RELEASE_NOTES.txt>
    order allow,deny
    deny from all
</Files>

############################################
## If running in cluster environment, uncomment this
## http://developer.yahoo.com/performance/rules.html#etags

    #FileETag none

## Permanent URL redirect - generated by www.rapidtables.com
Redirect 301 /thebirdword http://www.thebirdword.com
-
You probably have other redirects in your .htaccess and possibly in your website code. The order of your rewrites is also important. Publish your Apache config and I'll take a look.
FYI, there are better resources for technical issues than Moz. Most people here are not developers or IT specialists; we're more like SEO strategists and business managers.
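For example, the conventional safe ordering puts external 301 canonicalization rules before the front-controller rewrite, so a request is fully canonicalized before it is handed to index.php. A sketch only, with mydomain.com as a placeholder:

RewriteEngine On
RewriteBase /

# 1. Canonicalize the host first
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]

# 2. Then canonicalize the trailing slash (skipping real files/dirs)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]

# 3. Only then hand everything else to Magento's front controller
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L]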
-
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]

I have found both of the articles you linked here, and nothing is working - any code I try gives me the same error on most of my pages:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
Still need a fix for this
thanks
-
Yes, server redirects are necessary. Try these solutions to see which one works for you:
http://ralphvanderpauw.com/seo/how-to-301-redirect-a-trailing-slash-in-htaccess/
http://enarion.net/web/htaccess/trailing-slash/
You might want to consider moving to Nginx. You'll notice amazing speed and stability improvements with Nginx, Redis session cache, Memcached, OPcache, ngx_pagespeed, and Magento cache storage management. I can help much more with Nginx redirects and conf files - I gave up on Apache years ago. Sorry I couldn't be of more help.
Related Questions
-
Duplicate content on URL trailing slash
Hello, Some time ago we accidentally made changes to our site which modified the way URLs in links are generated. At once, trailing slashes were added to many URLs (only in links). Links that used to point to
example.com/webpage.html
were now linking to
example.com/webpage.html/
URLs in the XML sitemap remained unchanged (no trailing slash). We started noticing duplicate content (because our site renders the same page with or without the trailing slash). We corrected the problematic PHP URL function, so now all links on the site point to a URL without a trailing slash. However, Google had time to index these pages. Is implementing 301 redirects required in this case?
Intermediate & Advanced SEO | yacpro13
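Since Google has already indexed the slashed variants, a 301 is the standard cleanup. A minimal sketch, assuming an Apache host - it sends any .html URL carrying a stray trailing slash back to the slash-less version the sitemap uses:

RewriteEngine On
# 301 /webpage.html/ back to /webpage.html
RewriteRule ^(.+\.html)/$ /$1 [L,R=301]
-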
Duplicate content due to parked domains
I have a main ecommerce website with unique content and decent back links. I had a few domains parked on the main website, as well as on specific product pages. These domains had some type-in traffic; some were exact product names. So the main website www.maindomain.com had domain1.com and domain2.com parked on it, and domain3.com parked on www.maindomain.com/product1. This caused a lot of duplicate content issues. Twelve months back, all the parked domains were changed to 301 redirects. I also added all the domains to Google Webmaster Tools, then removed the main directory from the Google index. Now I realize a few of the additional domains are still indexed and causing duplicate content. My question is: what other steps can I take to avoid the duplicate content for my website?

1. Provide a change of address in Google Search Console. Is there any downside to providing a change of address pointing to a website? Also, domains pointing to a specific URL cannot provide a change of address.
2. Provide a remove-page-from-Google-index request in Google Search Console. It is temporary and lasts 6 months. Even if the pages are removed from Google's index, would Google still see them as duplicates?
3. Ask Google to fetch each URL under the other domains and submit it to the Google index. This would hopefully remove the URLs under domain1.com and domain2.com eventually, due to the 301 redirects.
4. Add canonical URLs for all pages on the main site, so Google will eventually remove content from domain1.com and domain2.com due to the canonical links. This will take time for Google to update its index.
5. Point these domains elsewhere to remove the duplicate content eventually. But it will take time for Google to update its index with the new, non-duplicate content.

Which of these options are best for my issue, and which ones are potentially dangerous? I would rather not point these domains elsewhere. Any feedback would be greatly appreciated.
Intermediate & Advanced SEO | ajiabs
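On the redirect side, if the parked domains still resolve to the main site's docroot, host-based 301s keep them consolidated. A sketch using the question's placeholder domain names, assuming an Apache host:

RewriteEngine On

# domain3.com was parked on a specific product page
RewriteCond %{HTTP_HOST} ^(www\.)?domain3\.com$ [NC]
RewriteRule ^ http://www.maindomain.com/product1 [L,R=301]

# domain1.com and domain2.com redirect every path to the main domain
RewriteCond %{HTTP_HOST} ^(www\.)?(domain1|domain2)\.com$ [NC]
RewriteRule ^(.*)$ http://www.maindomain.com/$1 [L,R=301]
-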
Avoiding Duplicate Content with Used Car Listings Database: Robots.txt vs Noindex vs Hash URLs (Help!)
Hi Guys, We have developed a plugin that allows us to display used vehicle listings from a centralized, third-party database. The functionality works similar to autotrader.com or cargurus.com, and there are two primary components:

1. Vehicle Listings Pages: this is the page where the user can use various filters to narrow the vehicle listings to find the vehicle they want.
2. Vehicle Details Pages: this is the page where the user actually views the details about said vehicle. It is served up via Ajax, in a dialog box on the Vehicle Listings Pages. Example functionality: http://screencast.com/t/kArKm4tBo

The Vehicle Listings pages (#1) we do want indexed and to rank. These pages have additional content besides the vehicle listings themselves, and those results are randomized or sliced/diced in different and unique ways. They're also updated twice per day. We do not want to index #2, the Vehicle Details pages, as these pages appear and disappear all of the time, based on dealer inventory, and don't have much value in the SERPs. Additionally, other sites such as autotrader.com, Yahoo Autos, and others draw from this same database, so we're worried about duplicate content. For instance, entering a snippet of dealer-provided content for one specific listing that Google indexed yielded 8,200+ results: Example Google query. We did not originally think that Google would even be able to index these pages, as they are served up via Ajax. However, it seems we were wrong, as Google has already begun indexing them. Not only is duplicate content an issue, but these pages are not meant for visitors to navigate to directly! If a user were to navigate to the URL directly, from the SERPs, they would see a page that isn't styled right. Now we have to determine the right solution to keep these pages out of the index: robots.txt, noindex meta tags, or hash (#) internal links.

Robots.txt advantages:
- Super easy to implement
- Conserves crawl budget for large sites
- Ensures the crawler doesn't get stuck. After all, if our website only has 500 pages that we really want indexed and ranked, and vehicle details pages constitute another 1,000,000,000 pages, it doesn't seem to make sense to make Googlebot crawl all of those pages.

Robots.txt disadvantages:
- Doesn't prevent pages from being indexed, as we've seen, probably because there are internal links to these pages. We could nofollow these internal links, thereby minimizing indexation, but this would lead to 10-25 nofollowed internal links on each Vehicle Listings page (will Google think we're pagerank sculpting?)

Noindex advantages:
- Does prevent vehicle details pages from being indexed
- Allows ALL pages to be crawled (advantage?)

Noindex disadvantages:
- Difficult to implement (vehicle details pages are served using Ajax, so they have no head tag). The solution would have to involve the X-Robots-Tag HTTP header and Apache, sending a noindex tag based on querystring variables, similar to this stackoverflow solution. This means the plugin functionality is no longer self-contained, and some hosts may not allow these types of Apache rewrites (as I understand it)
- Forces (or rather allows) Googlebot to crawl hundreds of thousands of noindex pages. I say "force" because of the crawl budget required. The crawler could get stuck/lost in so many pages, and may not like crawling a site with 1,000,000,000 pages, 99.9% of which are noindexed.
- Cannot be used in conjunction with robots.txt. After all, the crawler never reads the noindex meta tag if blocked by robots.txt

Hash (#) URL advantages:
- By using hash (#) links on Vehicle Listing pages to Vehicle Details pages (such as "Contact Seller" buttons), coupled with Javascript, the crawler won't be able to follow/crawl these links.
- Best of both worlds: crawl budget isn't overtaxed by thousands of noindex pages, and internal links used to index robots.txt-disallowed pages are gone.
- Accomplishes the same thing as "nofollowing" these links, but without looking like pagerank sculpting (?)
- Does not require complex Apache stuff

Hash (#) URL disadvantages:
- Is Google suspicious of sites with (some) internal links structured like this, since they can't crawl/follow them?

Initially, we implemented robots.txt - the "sledgehammer solution." We figured that we'd have a happier crawler this way, as it wouldn't have to crawl zillions of partially duplicate vehicle details pages, and we wanted it to be like these pages didn't even exist. However, Google seems to be indexing many of these pages anyway, probably based on internal links pointing to them. We could nofollow the links pointing to these pages, but we don't want it to look like we're pagerank sculpting or something like that. If we implement noindex on these pages (and doing so is a difficult task itself), then we will be certain these pages aren't indexed. However, to do so we will have to remove the robots.txt disallowal, in order to let the crawler read the noindex tag on these pages. Intuitively, it doesn't make sense to me to make Googlebot crawl zillions of vehicle details pages, all of which are noindexed, and it could easily get stuck/lost/etc. It seems like a waste of resources, and in some shadowy way bad for SEO. My developers are pushing for the third solution: using the hash URLs. This works on all hosts and keeps all functionality in the plugin self-contained (unlike noindex), and conserves crawl budget while keeping vehicle details pages out of the index (unlike robots.txt). But I don't want Google to slap us 6-12 months from now because it doesn't like links like these. Any thoughts or advice you guys have would be hugely appreciated, as I've been going in circles, circles, circles on this for a couple of days now. Also, I can provide a test site URL if you'd like to see the functionality in action.
Intermediate & Advanced SEO | browndoginteractive
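For reference, the X-Robots-Tag approach mentioned above can stay fairly small. A sketch, assuming an Apache host - "vehicle_id" is a placeholder for whatever query parameter actually identifies a details page:

<IfModule mod_rewrite.c>
    RewriteEngine On
    # Flag requests whose query string identifies a vehicle details page
    RewriteCond %{QUERY_STRING} (^|&)vehicle_id= [NC]
    RewriteRule .* - [E=VEHICLE_DETAIL:1]
</IfModule>
<IfModule mod_headers.c>
    # Tell crawlers not to index flagged responses (these URLs must NOT
    # also be blocked in robots.txt, or the header is never seen)
    Header set X-Robots-Tag "noindex, nofollow" env=VEHICLE_DETAIL
</IfModule>
-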
Tabs and duplicate content?
We own this site http://www.discountstickerprinting.co.uk/ and just a little concerned as I right clicked open in new tab on the tab content section and it went to a new page For example if you right click on the price tab and click open in new tab you will end up with the url
Intermediate & Advanced SEO | | BobAnderson
http://www.discountstickerprinting.co.uk/#tabThree Does this mean that our content is being duplicated onto another page? If so what should I do?0 -
Duplicate content on sites from different countries
Hi, we have a client who currently has a lot of duplicate content between their UK and US websites. Both websites are geographically targeted (via Google Webmaster Tools) to their specific location and have the appropriate local domain extension. Is having duplicate content a major issue, since they are in two different countries and geographic regions of the world? Any statement from Google about this? Regards, Bill
Intermediate & Advanced SEO | MBASydney
-
Is an RSS feed considered duplicate content?
I have a large client with satellite sites. The large site produces many news articles and they want to put an RSS feed on the satellite sites that will display the articles from the large site. My question is, will the rss feeds on the satellite sites be considered duplicate content? If yes, do you have a suggestion to utilize the data from the large site without being penalized? If no, do you have suggestions on what tags should be used on the satellite pages? EX: wrapped in tags? THANKS for the help. Darlene
Intermediate & Advanced SEO | gXeSEO
-
Trailing Slash: Lost in Redirection?
Question here, but first the lead-in. As you all know, 301 redirects don't pass on 100% of link juice. I've set up my site using .htaccess to redirect all non-www to www and redirect all URLs to have a trailing slash. FYI, the preferred domain is selected in WMT and canonical URLs appear in the head section of all pages. So now what happens when sites that link to mine don't include either the www or the trailing slash, which is actually quite common? Of course, asking the site owner to correct the link is ideal, but that's not always possible. So if thousands of links on external sites are linking to http://www.site.com instead of http://www.site.com/, won't lots of link juice get lost in redirection? I can't think of anything more I can do to the URLs to reduce duplicate content and juice dilution. Thoughts? Kevin
Intermediate & Advanced SEO | kwoolf
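One thing that can be controlled is making sure no inbound variant ever chains through two 301s. A sketch with www.site.com as a placeholder - each block sends its case to the final URL in a single hop:

RewriteEngine On

# non-www AND missing slash: one hop straight to www + slash
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ http://www.site.com/$1/ [L,R=301]

# any remaining non-www request: same path on www
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.site.com/$1 [L,R=301]

# www but missing slash: add it (skip real files)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*[^/])$ http://www.site.com/$1/ [L,R=301]
-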
Multiple URLs for the same page
I am working with a client and recently discovered that they have several URLs that go to the same page:

http://www.maps.com/FunFacts.aspx
http://www.maps.com/funfacts.aspx
http://www.maps.com/FunFacts.aspx?nav=FF
http://www.maps.com/FunFacts.aspx?nav=FS
http://www.maps.com/funfacts.aspx?nav=FF
http://www.maps.com/funfacts.aspx?nav=ff
http://www.maps.com/FunFacts.aspx?nav=MS
http://www.maps.com/funfacts.aspx?nav=
http://www.maps.com/FunFacts.aspx?nav=FF#
http://www.maps.com/FunFacts
http://www.maps.com/funfacts.aspx?.nav=FF

I am afraid this is happening all over the site. So, my question is: is this hurting the SEO, and how? If so, what is the best way to go about fixing this problem? Thanks for your help!
Intermediate & Advanced SEO | WebMarketingandDesign
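The .aspx extension suggests this site runs IIS, where the equivalent fix would live in web.config rewrite rules. If it were on Apache, though, a per-page sketch could pin one canonical URL - handled explicitly here, since lowercasing arbitrary paths needs a server-level RewriteMap:

RewriteEngine On

# Redirect any case variant, or any ?nav= style query, of this page to
# the single lowercase, query-free URL; the trailing "?" in the target
# strips the original query string.
RewriteCond %{REQUEST_URI} !^/funfacts\.aspx$ [OR]
RewriteCond %{QUERY_STRING} .
RewriteRule ^funfacts\.aspx$ http://www.maps.com/funfacts.aspx? [NC,L,R=301]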