Redirecting all URLs appended with index.htm or index.html
-
It has come to my attention that one of my clients (a WordPress website) has, for some time, had URLs in their Landing Page report (in GA - Google Analytics) that should all be pointing to the one page. For example:
domain.com/about-us also has a listing in GA as domain.com/about-us/index.htm
Is this some kind of indication of a subdirectory issue? Has anyone had experience with this in WordPress plugins such as Yoast SEO, or another SEO plugin?
My thought here is to simply redirect any of these non-existent files with a redirect in .htaccess, but what I'm using isn't working. I will insert the redirect here, and any help would be greatly appreciated.
RewriteEngine on
RewriteCond %{THE_REQUEST} ^./index.html?
RewriteRule ^(.)index.html?$ http://www.dupontservicecenter.com/$1 [R=301,L]
and this rewrite doesn't work:
RewriteEngine on
RewriteRule ^(.+).htm$ http://dupontservicecenter.com/$1.php [R,NC]
_Cindy
-
ThompsonPaul,
Thank you! I've looked at that feature so many times, and read and reread the info Google provided. Reading this information literally, as someone at my level would, it really doesn't specify whether adding the default page "adds" index.htm(l) to the URL and thereby combines all "same pages", or removes it to combine "same pages"
-- and I assumed the latter, since that is what happens with permalinks in WP... go figure. Now I realize it adds. It also didn't occur to me that this feature wouldn't act as a filter would, with the results visible right away.
OK, so I have removed "index.htm" from the default page field; it is all clear now. Additionally, I am also seeing "index.html" appended to my URLs, in addition to the actual URL. So I am seeing, for example:
/about-us/, /about-us/index.htm, and in some cases URLs like /about-us/index.html.
I can only guess that at one time both of these default pages were in the default page setting: "index.html" and "index.htm". And anyway, these pages with index.htm(l) do not exist... which would explain that, right? Likely this issue concerns settings in GA.
-
So one more perplexing issue: in the Search Console landing page report I am showing 0 hits for any URL appended with either index.htm or index.html.
-
But in the regular reporting of landing pages, and also in custom reporting, these pages (the ones appended with index.htm(l)) are showing hits. What could cause this discrepancy?
-
As you suggested, it would take a bit of filtering to clean up these URLs in Google Analytics? And so, if the issue is in Google Analytics, then any redirect in the htaccess file is for naught?
-
So it will likely take several weeks, for this small business site, to begin showing clean URLs and to see if this is actually the issue?
Thank you so very much!
_Cindy
-
-
Thomas, thank you for your help. It did occur to me that perhaps the order of items in the htaccess file may be the issue.
I am going to look into this issue - thanks to your suggestion, and then see if my redirects are working as they should.
When I do, I'll get back to you on this topic.
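On the ordering point: in a typical WordPress .htaccess, the catch-all rule in the WordPress block hands every request that is not a real file or directory to index.php, so a custom redirect placed below that block may never get a chance to match. A minimal sketch of the usual ordering, assuming the stock WordPress rewrite block and a site installed in the document root; the old-page and new-page paths are hypothetical placeholders:

```apache
# Custom redirects go first, above the WordPress block.
<IfModule mod_rewrite.c>
RewriteEngine On
# Hypothetical example rule: /old-page -> /new-page/
RewriteRule ^old-page/?$ /new-page/ [R=301,L]
</IfModule>

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Catch-all: anything that is not a real file or directory goes to WordPress
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

Keeping custom rules outside the BEGIN/END markers also matters because WordPress regenerates whatever sits between them.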
Now, I'm trying to wrap my mind around the issue of why "index.htm" and "index.html" show up when my site is WP based and therefore built on PHP. ThompsonPaul has responded with what was my next look (and actually a 4th to 8th look) concerning the default page setting in GA.
Thanks again.
_Cindy
-
Cindy, this is almost certainly an issue with the way your Google Analytics is configured, not your WP site. (The fact that "index.htm" comes after a "/" is the clue.)
If you check the View Settings link under the View in the Admin section of your dashboard, you'll find a field called Default Page. For most correctly configured modern sites (WP sites included), this field must be empty for GA to be configured correctly. I'm betting your config has index.htm entered in that field. [See screenshot below.]
Once you remove that entry, your data will avoid the problem going forward, but it will take some work with custom filters if you want to try to clean up the historical data.
Let me know if that solves the issue?
Paul
-
Are you able to copy out your whole .htaccess?
I've got to admit, I'm not the best with it, but I'll try and help you figure this out.
-
Hi Thomas,
Very much appreciate your response.
So far none of the redirects are working, including your suggestion. So I tested the htaccess file by changing one of the redirects that has been in the file for some time now and used to work:
RewriteCond %{HTTP_HOST} ^dupontservicecenter.com/buying-and-selling$
RewriteRule ^$ http://dupontservicecenter.com/rewards/auto-service-credit [L,R=301]
...not working; it is redirecting to the old URL, the one I changed.
I have purged the cache (using LiteSpeed Cache for WP, since I'm on a LiteSpeed server these days). Could it be a purge issue? What would cause the htaccess file not to work properly?
The only redirect that is working is through a plugin for WP, Quick Redirects, which uses the wp_redirect() function.
Totally lost in a haystack.
Any further suggestions would be helpful, otherwise, a complete, timely, breakdown of all website components will have to be proposed to the client.
_Cindy
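One possible reason the test rule above never matches: %{HTTP_HOST} holds only the host name from the request, never the path, so a condition that includes /buying-and-selling cannot be satisfied, and the pattern ^$ only matches the home page. A hedged sketch of how the same redirect might be written, assuming the .htaccess file sits in the document root, that the old path is /buying-and-selling, and that these lines are placed above the WordPress block:

```apache
RewriteEngine On
# Host check only: optional www., no path component
RewriteCond %{HTTP_HOST} ^(www\.)?dupontservicecenter\.com$ [NC]
# The path is matched here, without the leading slash in .htaccess context
RewriteRule ^buying-and-selling/?$ http://dupontservicecenter.com/rewards/auto-service-credit [R=301,L]
```

Browsers cache 301 responses, so an earlier version of a redirect can keep appearing after the rule changes; testing in a private window or with a command-line client helps rule that out.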
-
https://moz.com/community/q/redirecting-index-html-to-the-root
StreamlineMetrics:
If you want to redirect all index.html(s) to their roots, then try this code -
RewriteEngine On
RewriteRule ^index.html$ / [R=301,L]
RewriteRule ^(.*)/index.html$ /$1/ [R=301,L]
And yes, Google will treat them as 301 redirects, so your juice will be transferred and consolidated.
Obviously, change index.html to index.htm if that is the file name in your URLs.
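Both extensions can be handled by one rule. The following is a sketch, not a tested drop-in: it assumes the .htaccess file is in the document root and that these lines sit above any WordPress block. The RewriteCond on %{THE_REQUEST} is an addition that is not part of the answer above; it limits the redirect to URLs the visitor actually requested, which guards against loops if the server internally maps a directory to an index file.

```apache
RewriteEngine On
# Only act when the browser's own request line contains index.htm or index.html
RewriteCond %{THE_REQUEST} \s/+(.*/)?index\.html?[\s?] [NC]
# Strip the file name and redirect to the containing directory
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L,NC]
```

With this in place, /about-us/index.htm and /about-us/index.html would both redirect to /about-us/, and /index.html to /. The dots are escaped so that they match a literal "." and not any character.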