Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Trailing Slashes for Magento CMS pages - 2 URLS - Duplicate content
-
Hello,
Can anyone help me find a way to make Magento CMS pages resolve on one URL instead of two?
I found a previous article that applies to my issue. It uses .htaccess to 301 redirect requests for Magento pages from the non-slash URL to the slash URL. I don't fully understand the .htaccess syntax, but I used the code below.
It fixed the CMS page redirection but caused issues on other pages, including all my categories and products, which now fail with this error:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
# Assuming you're running at domain root. Change to working directory if needed.
RewriteBase /
#
# www check
# If you're running in a subdirectory, then you'll need to add that in
# to the redirected url (http://www.mydomain.com/subdirectory/$1)
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
#
# Trailing slash check
# Don't fix direct file links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]
#
# Finally, forward everything to your front-controller (index.php)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [QSA,L]
-
301s are not difficult for me, but I don't know how to write the logic that re-routes requests for "URL" to "URL/". I can manually 301 or rel=canonical my Magento CMS pages every time, but that defeats the purpose of the .htaccess automation I am trying to get working.
thanks
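For readers hitting the same loop, here is a minimal sketch of a narrower trailing-slash rule. It is not from the thread and is untested against this store. It assumes that CMS page URLs are extensionless (for example /about-us) while category and product URLs end in a suffix such as .html, and that the loop comes from the catch-all rule also adding a slash to URLs the application prefers to serve without one. If those assumptions do not hold, the conditions will need adjusting.

```apache
# Sketch only, under the assumptions stated above.
RewriteEngine On
RewriteBase /

# Leave real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# Skip URLs whose last segment contains a dot (assumed: .html category/product URLs)
RewriteCond %{REQUEST_URI} !\.[^/]+$
# Only redirect GET/HEAD so form POSTs are not turned into GETs
RewriteCond %{REQUEST_METHOD} ^(GET|HEAD)$
# Append the slash with a single permanent redirect
RewriteRule ^(.*)$ /$1/ [R=301,L]
```

A block like this would normally sit before the final rewrite to index.php. Admin, API, or checkout paths may need their own exclusions, so it is worth checking a handful of URLs of each type before relying on it.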
-
Thank you, Kevin.
This is almost the default Magento .htaccess file (out of the box). I think I added a couple of entries to fix a couple of other issues. The code I just added that isn't working is in the middle of the .htaccess; its comment starts with this: "## slash removal re-write done by ALEX MEADE for iamgreenminded.com"
## uncomment these lines for CGI mode
## make sure to specify the correct cgi php binary file name
## it might be /cgi-bin/php-cgi
# Action php5-cgi /cgi-bin/php5-cgi
# AddHandler php5-cgi .php
############################################
## GoDaddy specific options
# Options -MultiViews
## you might also need to add this line to php.ini
## cgi.fix_pathinfo = 1
## if it still doesn't work, rename php.ini to php5.ini
############################################
## this line is specific for 1and1 hosting
#AddType x-mapp-php5 .php
#AddHandler x-mapp-php5 .php
############################################
## default index file
DirectoryIndex index.php
############################################
## adjust memory limit
# php_value memory_limit 64M
php_value memory_limit 256M
php_value max_execution_time 18000
############################################
## disable magic quotes for php request vars
php_flag magic_quotes_gpc off
############################################
## disable automatic session start
## before autoload was initialized
php_flag session.auto_start off
############################################
## enable resulting html compression
#php_flag zlib.output_compression on
###########################################
## disable user agent verification to not break multiple image upload
php_flag suhosin.session.cryptua off
###########################################
## turn off compatibility with PHP4 when dealing with objects
php_flag zend.ze1_compatibility_mode Off
<IfModule mod_security.c>
###########################################
## disable POST processing to not break multiple image upload
SecFilterEngine Off
SecFilterScanPOST Off
</IfModule>
############################################
## enable apache served files compression
## http://developer.yahoo.com/performance/rules.html#gzip
# Insert filter on all content
###SetOutputFilter DEFLATE
# Insert filter on selected content types only
#AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript
# Netscape 4.x has some problems...
#BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08 have some more problems
#BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE masquerades as Netscape, but it is fine
#BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
# Don't compress images
#SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
# Make sure proxies don't deliver the wrong content
#Header append Vary User-Agent env=!dont-vary
############################################
## make HTTPS env vars available for CGI mode
SSLOptions StdEnvVars
############################################
## enable rewrites
Options +FollowSymLinks
RewriteEngine on
############################################
## slash removal re-write done by ALEX MEADE for iamgreenminded.com
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteCond %{REQUEST_FILENAME} !\.(gif|jpg|png|jpeg|css|js)$ [NC]
RewriteRule ^(.*)$ http://%{HTTP_HOST}/$1/ [L,R=301]
############################################
## you can put here your magento root folder
## path relative to web root
#RewriteBase /magento/
############################################
## uncomment next line to enable light API calls processing
# RewriteRule ^api/([a-z][0-9a-z_]+)/?$ api.php?type=$1 [QSA,L]
############################################
## rewrite API2 calls to api.php (by now it is REST only)
RewriteRule ^api/rest api.php?type=rest [QSA,L]
############################################
## workaround for HTTP authorization
## in CGI environment
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
############################################
## TRACE and TRACK HTTP methods disabled to prevent XSS attacks
RewriteCond %{REQUEST_METHOD} ^TRAC[EK]
RewriteRule .* - [L,R=405]
############################################
## redirect for mobile user agents
#RewriteCond %{REQUEST_URI} !^/mobiledirectoryhere/.*$
#RewriteCond %{HTTP_USER_AGENT} "android|blackberry|ipad|iphone|ipod|iemobile|opera mobile|palmos|webos|googlebot-mobile" [NC]
#RewriteRule ^(.*)$ /mobiledirectoryhere/ [L,R=302]
############################################
## always send 404 on missing files in these folders
RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
############################################
## never rewrite for existing files, directories and links
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
############################################
## rewrite everything else to index.php
RewriteRule .* index.php [L]
############################################
## Prevent character encoding issues from server overrides
## If you still have problems, use the second line instead
AddDefaultCharset Off
#AddDefaultCharset UTF-8
############################################
## Add default Expires header
## http://developer.yahoo.com/performance/rules.html#expires
ExpiresDefault "access plus 1 year"
############################################
## By default allow all access
Order allow,deny
Allow from all
###########################################
## Deny access to release notes to prevent disclosure of the installed Magento version
<Files RELEASE_NOTES.txt>
order allow,deny
deny from all
</Files>
############################################
## If running in cluster environment, uncomment this
## http://developer.yahoo.com/performance/rules.html#etags
#FileETag none
# Permanent URL redirect - generated by www.rapidtables.com
Redirect 301 /thebirdword http://www.thebirdword.com
-
You probably have other redirects in your .htaccess and possibly in your website code. The order of your rewrites is also important. Post your Apache config and I'll take a look.
FYI, there are better resources for technical issues than Moz. Most people here are not developers or IT specialists; we're more like SEO strategists and business managers.
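To illustrate the point about rule order, here is a rough sketch of how external redirects and the front-controller rewrite are usually arranged in a per-directory .htaccess. It assumes a Magento-style index.php front controller, and www.example.com is a placeholder host, not a value from this thread.

```apache
RewriteEngine On
RewriteBase /

# 1. External (R=301) canonicalisation redirects go first, so they see
#    the path the visitor actually requested.
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

# 2. The internal rewrite to the front controller goes last.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule .* index.php [L]
```

In .htaccess context the rule set runs again after an internal rewrite, so a host redirect placed below the index.php rule would most likely be evaluated against index.php rather than the original path. Redirects issued by the application itself are a separate layer and can still conflict with rules like these.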
-
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !example.php
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ http://domain.com/$1/ [L,R=301]
I have found both of the articles you linked here, but nothing is working. Any code I try gives me the same error on most of my pages:
"This webpage has a redirect loop
ERR_TOO_MANY_REDIRECTS"
Still need a fix for this
thanks
-
Yes, server redirects are necessary. Try these solutions to see which one works for you:
http://ralphvanderpauw.com/seo/how-to-301-redirect-a-trailing-slash-in-htaccess/
http://enarion.net/web/htaccess/trailing-slash/
You might want to consider moving to Nginx. You'll notice amazing speed and stability improvement with Nginx, Redis Session Cache, Memcached, OpCache, Ngx_pagespeed, and Magento Cache Storage Management. I can help much more with Nginx redirects and conf files--I gave up Apache years ago. Sorry I couldn't be of more help.