Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Remove html file extension and 301 redirects
-
Hi
Recently I ask for some work done on my website from a company, but I am not sure what they've done is right.
What I wanted was html file extensions to be removed like
/ash-logs.html to /ash-logs
also the index.html to www.timports.co.uk
I have done a crawl diagnostics and have duplicate page content and 32 page title duplicates. This is so doing my head in please helpThis is what is in the .htaccess file
<ifmodule pagespeed_module="">ModPagespeed on
ModPagespeedEnableFilters extend_cache,combine_css, collapse_whitespace,move_css_to_head, remove_comments</ifmodule><ifmodule mod_headers.c="">Header set Connection keep-alive</ifmodule>
<ifmodule mod_rewrite.c="">Options +FollowSymLinks -MultiViews</ifmodule>
DirectoryIndex index.html
RewriteEngine On
#Rewrite valid requests on .html files RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html?rw=1 [L,QSA]
#Return 404 on direct requests against .html files
RewriteCond %{REQUEST_URI} .html$
RewriteCond %{QUERY_STRING} !rw=1 [NC]
RewriteRule ^ - [R=404]AddCharset UTF-8 .html # <filesmatch “.(js|css|html|htm|php|xml|swf|flv|ashx)$”="">#SetOutputFilter DEFLATE #</filesmatch>
<ifmodule mod_expires.c="">ExpiresActive On
ExpiresByType image/gif "access plus 1 years"
ExpiresByType image/jpeg "access plus 1 years"
ExpiresByType image/png "access plus 1 years"
ExpiresByType image/x-icon "access plus 1 years"
ExpiresByType image/jpg "access plus 1 years"
ExpiresByType text/css "access 1 years"
ExpiresByType text/x-javascript "access 1 years"
ExpiresByType application/javascript "access 1 years"
ExpiresByType image/x-icon "access 1 years"</ifmodule><files 403.shtml="">order allow,deny allow from all</files>
redirect 301 /PRODUCTS http://www.timports.co.uk/kiln-dried-logs
redirect 301 /kindling_firewood.html http://www.timports.co.uk/kindling-firewood.html
redirect 301 /about_us.html http://www.timports.co.uk/about-us.html
redirect 301 /log_delivery.html http://www.timports.co.uk/log-delivery.html redirect 301 /oak_boards_delivery.html http://www.timports.co.uk/oak-boards-delivery.html
redirect 301 /un_edged_oak_boards.html http://www.timports.co.uk/un-edged-oak-boards.html
redirect 301 /wholesale_logs.html http://www.timports.co.uk/wholesale-logs.html redirect 301 /privacy_policy.html http://www.timports.co.uk/privacy-policy.html redirect 301 /payment_failed.html http://www.timports.co.uk/payment-failed.html redirect 301 /payment_info.html http://www.timports.co.uk/payment-info.html -
This looks good to me, the html pages are 301ing to the non .html versions.
-
I think I've done it this is what I have found and added to my htaccess code.
<ifmodule mod_rewrite.c="">
Options +FollowSymLinks -MultiViews</ifmodule>DirectoryIndex index.html
RewriteEngine On
RewriteBase /#removing trailing slash
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]#non www to www
RewriteCond %{HTTP_HOST} !^www.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]#html
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^.]+)$ $1.html [NC,L]#index redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.html\ HTTP/
RewriteRule ^index.html$ http://www.timports.co.uk/ [R=301,L]
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L] -
I still have the internal error, thank you for your time in looking at this I will keep trying
-
Hi,
htaccess can be a pain and I will admit I usually manage what I am after with a bit of trial and error. Try the following, and if you have problems concentrate on the lines:
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L]I have added a redirect for index.html to root, and from non www to www and removed the last .html from the last list of _ to - redirects. Give it a shot, and keep that backup handy just in case. If no go, maybe one of the htaccess experts around can step in and have a look, I am not 100% sure what some of those other rules are doing to be honest!
<ifmodule pagespeed_module="">ModPagespeed on
ModPagespeedEnableFilters extend_cache,combine_css, collapse_whitespace,move_css_to_head, remove_comments</ifmodule><ifmodule mod_headers.c="">Header set Connection keep-alive</ifmodule>
AddCharset UTF-8 .html
<filesmatch ".(js|css|html|htm|php|xml|swf|flv|ashx)$"="">
#SetOutputFilter DEFLATE
#</filesmatch><ifmodule mod_expires.c="">ExpiresActive On
ExpiresByType image/gif "access plus 1 years"
ExpiresByType image/jpeg "access plus 1 years"
ExpiresByType image/png "access plus 1 years"
ExpiresByType image/x-icon "access plus 1 years"
ExpiresByType image/jpg "access plus 1 years"
ExpiresByType text/css "access 1 years"
ExpiresByType text/x-javascript "access 1 years"
ExpiresByType application/javascript "access 1 years"
ExpiresByType image/x-icon "access 1 years"</ifmodule><files 403.shtml="">order allow,deny allow from all</files>
# mod_rewrite On only needed once
RewriteEngine On301 permanent redirect old underscore.html to new dash urls
redirect 301 /PRODUCTS http://www.timports.co.uk/kiln-dried-logs
redirect 301 /kindling_firewood.html http://www.timports.co.uk/kindling-firewood
redirect 301 /about_us.html http://www.timports.co.uk/about-us
redirect 301 /log_delivery.html http://www.timports.co.uk/log-delivery
redirect 301 /oak_boards_delivery.html http://www.timports.co.uk/oak-boards-delivery
redirect 301 /un_edged_oak_boards.html http://www.timports.co.uk/un-edged-oak-boards
redirect 301 /wholesale_logs.html http://www.timports.co.uk/wholesale-logs
redirect 301 /privacy_policy.html http://www.timports.co.uk/privacy-policy
redirect 301 /payment_failed.html http://www.timports.co.uk/payment-failed
redirect 301 /payment_info.html http://www.timports.co.uk/payment-info301 permanent redirect index.html to folder
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)index.html?\ HTTP/
RewriteRule ^(([^/]+/))index.html?$ http://www.timports.co.uk/$1 [R=301,L]301 permanent redirect non-www to www
RewriteCond %{HTTP_HOST} !^(www.timports.co.uk)?$
RewriteRule (.*) http://www.timports.co.uk/$1 [R=301,L]301 permanent redirect all .html to non .html
RewriteCond %{THE_REQUEST} .html
RewriteRule ^(.*).html$ /$1 [R=301,L] -
thanks Lyn, but that gave an 500 internal error, back up worked though
-
Hi,
I think you will only need this bit:
#301 from example.com/page.html to example.com/page
RewriteCond%{THE_REQUEST}^[A-Z]{3,9}\ /..html\ HTTP/
RewriteRule^(.).html$ /$1 [R=301,L]And you would replace this bit below with the above:
Rewrite valid requests on .html files RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html?rw=1 [L,QSA]
#Return 404 on direct requests against .html files
RewriteCond %{REQUEST_URI} .html$
RewriteCond %{QUERY_STRING} !rw=1 [NC]
RewriteRule ^ - [R=404]But leave the at the end of that section.
htaccess files can be a bit picky, so be sure to keep a backup so you can quickly undo something if it is not working!
-
Ok have got links to work again with old code, going to try this
#example.com/page will display the contents of example.com/page.html RewriteCond%{REQUEST_FILENAME}!-f RewriteCond%{REQUEST_FILENAME}!-d RewriteCond%{REQUEST_FILENAME}.html -f RewriteRule^(.+)$ $1.html [L,QSA] #301 from example.com/page.html to example.com/page RewriteCond%{THE_REQUEST}^[A-Z]{3,9}\ /..html\ HTTP/ RewriteRule^(.).html$ /$1 [R=301,L]
where would I put this code in relation to what I already have in my htaccess file
-
Thanks you for your reply, I have looked at the links you provided and tried replacing this RewriteEngine On #
Rewrite valid requests on .html files RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html?rw=1 [L,QSA]
Return 404 on direct requests against .html files
RewriteCond %{REQUEST_URI} .html$
RewriteCond %{QUERY_STRING} !rw=1 [NC]
RewriteRule ^ - [R=404]with this, but it didn't work or I did something wrong. #example.com/page will display the contents of example.com/page.html RewriteCond%{REQUEST_FILENAME}!-f RewriteCond%{REQUEST_FILENAME}!-d RewriteCond%{REQUEST_FILENAME}.html -f RewriteRule^(.+)$ $1.html [L,QSA] #301 from example.com/page.html to example.com/page RewriteCond%{THE_REQUEST}^[A-Z]{3,9}\ /..html\ HTTP/ RewriteRule^(.).html$ /$1 [R=301,L]
Now www.timports.co.uk says this page cant be displayed so I tried to put it back to the previous .htaccess and still no links working
I am so stuck
-
Hi,
Indeed there seems to be an issue with your redirects since the .html versions are still available on your site. Two things to check in the first instance:
1. The redirect line for the .html to non .html versions:
Rewrite valid requests on .html files RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html?rw=1 [L,QSA]
I am not sure if this will work the way you want it. First of all a # at the beginning of this line means it is a comment and not processed so you seem to have the RewriteCond part of the statement as a comment (maybe this is just the forum formatting it wrong, but good to check).
You can check some other solutions for redirecting .html to non .html here: http://stackoverflow.com/questions/5730092/how-to-remove-html-from-url2. At the bottom of the file you have a bunch of 301 redirects like this:
redirect 301 /kindling_firewood.html http://www.timports.co.uk/kindling-firewood.html
Which are working as expected redirecting underscored urls to urls with dashes. But they are also redirecting to the .html version which means you will be getting into double redirects which is pointless in your case. Once you have the non .html redirects working as expected you should adjust these 301s to go to the non .html version like so:
redirect 301 /kindling_firewood.html http://www.timports.co.uk/kindling-firewood
Hope that helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What is the difference between 301 redirects and backlinks?
i have seen some 301 redirects on my site billsonline, can anyone please explain the difference between backlinks and 301 redirects, i have read some articles where the writer was stating that 301 are not good for website.
Technical SEO | | aliho0 -
Google is still indexing the old domain a year after 301 redirects are put in place
Hi there, You might have experienced this before but for me this is the first. A client of mine moved from domain A (www.domainA.com) to domain B (www.domainB.com). 301 redirects are all in place for over a year. But the old domain is still showing in Google when you search for "site:domainA.com" The HTTP Header check shows this result for the URL https://www.domainA.com/company/cookie-policy.aspx HTTP/1.1 301 Moved Permanently =>
Technical SEO | | iQi
Cache-Control => private
Content-Length => 174
Content-Type => text/html; charset=utf-8
Location => https://www.domain_B_.com/legal/cookie-policy
Server => Microsoft-IIS/10.0
X-AspNetMvc-Version => 5.2
X-AspNet-Version => 4.0.30319
X-Powered-By => ASP.NET
Date => Fri, 15 Mar 2019 12:01:33 GMT
Connection => close Does the redirect look wrong? The change of address request was made on Google Console when the website was moved over a year ago. Edit: Checked the domainA.com on bing and it seems that its not indexed, and replaced with domainB.com, which is the right. Just Google is indexing the old domain! Please let me know your thoughts on why this is happening. Best,0 -
301 Redirects, Sitemaps and Indexing - How to hide redirected urls from search engines?
We have several pages in our site like this one, http://www.spectralink.com/solutions, which redirect to deeper page, http://www.spectralink.com/solutions/work-smarter-not-harder. Both urls are listed in the sitemap and both pages are being indexed. Should we remove those redirecting pages from the site map? Should we prevent the redirecting url from being indexed? If so, what's the best way to do that?
Technical SEO | | HeroDesignStudio0 -
Redirect URLS with 301 twice
Hello, I had asked my client to ask her web developer to move to a more simplified URL structure. There was a folder called "home" after the root which served no purpose. I asked for the URLs to be redirected using 301 to the new URLs which did not have this structure. However, the web developer didn't agree and decided to just rename the "home" folder "p". I don't know why he did this. We argued the case and he then created the URL structure we wanted. Initially he had 301 redirected the old URLS (the one with "Home") to his new version (the one with the "p"). When we asked for the more simplified URL after arguing, he just redirected all the "p" URLS to the PAGE NOT FOUND. However, remember, all the original URLs are now being redirected to the PAGE NOT FOUND as a result. The problems I see are these unless he redirects again: The new simplified URLS have to start from scratch to rank 2)We have duplicated content - two URLs with the same content Customers clicking products in the SERPs will currently find that they are being redirect to the 404 page. I understand that redirection has to occur but my questions are these: Is it ok to redirect twice with 301 - so old URL to the "p" version then to final simplified version. Will link juice be lost doing this twice? If he redirects from the original URLS to the final version missing out the "p" version, what should happen to the "p" version - they are currently indexed. Any help would be appreciated. Thanks
Technical SEO | | AL123al0 -
How do I fix a 301 Redirect Loop?
Saturday I waas doing some correcting of some duplicate titles, including nofollowing tags, etc. (my main problem was duplicate titles due to tags and categories being indexed). Now this morning I see that one of my pages refuses to load, citing a 301 redirect loop. http://www.incredibleinfant.com/feeding/switching-baby-formula/ Originally, the page was posted under the wrong category. http://www.incredibleinfant.com/uncategorized/switching-baby-formula I resaved it under the correct category (feeding) and now it won't load. Can someone help me figure out how to correct this mess? Thanks so much Heather
Technical SEO | | Gotmoxie0 -
Switching from a .org to .io (301 domain redirect)
I'm considering switching my main site from a .org to .io address; the .org is an exact match domain which helped to kickstart it a few years ago and now has about 50% repeat visitors, but was thrown off the Apple affiliation program for trademark infringement. I've found and purchased a nice (non-infringing) .io domain, and I've read the advice here on how to properly 301 the old domain; but my question is - does it matter that it's .io? Is this going to significantly hurt my rankings, even when everything has been 301'd properly? Another thought I had is that I may actually come out better off in the long run, what with Google penalties being applied to exact match domains. Is this a ranking suicide? If so, I'm tempted to leave it as is; even without the affiliation, it's making a good amount every month in ad fees that I don't want to disrupt. Thanks all!
Technical SEO | | w0lfiesmithUK0 -
Where does Wordpress store the 301 redirects?
Hi, I've just created a campaign for my new wordpress blog and found 11 301 redirects which I was not aware of. It looks like wordpress has created them automatically. Does any one know how wordpress handles this issues or where are they stored so I can delete them? They are of no use for me. 9 of these redirects point to the same url with an added '/' and are in pages 1 is on a post. I've been changing the permalink and some urls several times and maybe one of these times the Wordpress has automatically created the 301 redirect. But why? I do not want to keep the old url. the last redirect is very strange it goes from http://www.mydomain.com/folder to http://www.mydomain.com where folder is the folder where I installed wordpress. But again, I want no one to type the url with the folder name or even know this folder exists. Any comment on this would be greatly appreciated. Thanks a lot, David
Technical SEO | | dballari0 -
Multiple Domains, Same IP address, redirecting to preferred domain (301) -site is still indexed under wrong domains
Due to acquisitions over time and the merging of many microsites into one major site, we currently have 20+ TLD's pointing to the same IP address as our "preferred domain:" for our consolidated website http://goo.gl/gH33w. They are all set up as 301 redirects on apache - including both the www and non www versions. When we launched this consolidated website, (April 2010) we accidentally left the settings of our site open to accept any of our domains on the same IP. This was later fixed but unfortunately Google indexed our site under multiple of these URL's (ignoring the redirects) using the same content from our main website but swapping out the domain. We added some additional redirects on apache to redirect these individual pages pages indexed under the wrong domain to the same page under our main domain http://goo.gl/gH33w. This seemed to help resolve the issue and moved hundreds of pages off the index. However, in December of 2010 we made significant changes in our external dns for our ip addresses and now since December, we see pages indexed under these redirecting domains on the rise again. If you do a search query of : site:laboratoryid.com you will see a few hundred examples of pages indexed under the wrong domain. When you click on the link, it does redirect to the same page but under the preferred domain. So the redirect is working and has been confirmed as 301. But for some reason Google continues to crawl our site and index under this incorrect domains. Why is this? Is there a setting we are missing? These domain level and page level redirects should be decreasing the pages being indexed under the wrong domain but it appears it is doing the reverse. All of these old domains currently point to our production IP address where are preferred domain is also pointing. Could this be the issue? None of the pages indexed today are from the old version of these sites. They only seem to be the new content from the new site but not under the preferred domain. Any insight would be much appreciated because we have tried many things without success to get this resolved.
Technical SEO | | sboelter0