Redirecting all URLs appended with index.htm or index.html
-
It has come to my attention with one of my clients (WordPress website) that for some time they have within their Landing Page report (of GA - Google Analytics) URLs that should all be pointing to the one page, example:
domain.com/about-us, also has a listing in GA as domain.com/about-us/index.htm
Is this some kind of indication of a subdirectory issue? Has anyone had experience with this in such wordpress plugins as Yoast SEO, or other SEO plugin?
My thoughts here are to simply redirect any of these non-existent files with a redirect in .htaccess - but what I'm using isn't working. I will insert the redirect here - - and any help would be greatly appreciated.
RewriteEngine onRewriteCond %{THE_REQUEST} ^./index.html?
RewriteRule ^(.)index.html?$ http://www.dupontservicecenter.com/$1 [R=301,L]and this rewrite doesn't work:
RewriteEngine on
RewriteRule ^(.+).htm$ http://dupontservicecenter.com/$1.php [R,NC]_Cindy
-
ThompsonPaul,
Thank you! I've looked at that feature so many times, and read and reread the info Google provided, and clearly reading this information literally, as someone at my level would, it really doesn't specify whether adding the default page "adds" index.htm(l) to the url and therefore combines all "same-pages" or if it removes it to combine "same-pages"
-- and I assumed the later since that is what happens with permalinks in WP... go figure. Now I realize it adds. Also it didn't occur to me that this feature wouldn't act as a filter would and you would see the results right away.
OK so I have removed "index.htm" from the default page field, it is all clear now. Additionally I am also showing appended to my url's an "index.html" -- and this is in addition the actual url. So I am seeing, for example:
/about-us/ /about-us/index.htm and in some cases urls like /about-us/index.html.
I can only guess that at one time both of these default urls were in the default page setting... "index.html" and "index.htm" And anyway these pages with index.htm(l) do not exist, ...which would explain that right, likely this issue concerns settings in GA
-
So one more perplexing issue - in the search console landing page report I am showing 0 hits for any url appended with either index.htm or index.html.
-
But in the regular reporting of landing pages, and also custom reporting, these pages are showing hits (pages appended w index.htm(l)). What could cause this discrepancy?
-
As you suggestion it would take a bit of filtering to clean up these url's in Google Analytics? And so if it is in Google Analytics then any redirect in the htaccess file is for naught?
-
So a several weeks, likely for this small business site, to begin showing clean urls and to see if this is actually this issue?
Thank you so very much!
_Cindy
-
-
Thomas, thank you for your help. I did occur to me that perhaps the order of items in the htaccess file may be the issue.
I am going to look into this issue - thanks to your suggestion, and then see if my redirects are working as they should.
When I do, I'll get back to you on this topic.
Now, I'm trying to wrap my mind around the issue of why "index.htm and index html" when my site is WP based and therefore a PHP framework. ThompsonPaul has responded with what was my next look (and actually a 4th to 8th look) concerning the default page setting in GA.
Thanks again.
_Cindy -
Cindy, this is almost certainly an issue with the way your Google Analytics is configured, not your WP site. (the fact the "index.htm" comes after a "/" is the clue.
If you check the View Settings link under the View in the Admin section of your dashboard, you'll find a field called Default Page. For most correctly configured modern sites (WP sites included), this field must be empty for GA to be configured correctly. I'm betting your config has index.htm entered in that field. [See screenshot below.]
Once you remove that entry, your data will avoid the problem going forward, but it will take some work with custom filters if you want to try to clean up the historical data.
Let me know if that solves the issue?
Paul
-
Are you able to copy out your whole htaccess?
I've got to admit, i'm not the best with it but I'll try and help you figure this out
-
Hi Thomas,
Very much appreciate your reponse.
So far none of the redirects are working, including your suggestion. So I tested the htaccess file with this redirect, changing one of the redirects already listed in the htaccess file for some time now, which use to work...
RewriteCond %{HTTP_HOST} ^dupontservicecenter.com/buying-and-selling$
RewriteRule ^$ http://dupontservicecenter.com/rewards/auto-service-credit [L,R=301]...not working, is redirecting to the old url, the one I changed.
I have purged cache (using litespeed cache for wp since I'm on a litespeed server these days). Could it be a purge issue? What would cause the htaccess file not to work properly?
The only redirect that is working is through a plugin for wp - quick redirects which uses the wp_redirect() function.
Totally lost in a haystack.
Any further suggestions would be helpful, otherwise, a complete, timely, breakdown of all website components will have to be proposed to the client.
_Cindy
-
https://moz.com/community/q/redirecting-index-html-to-the-root
StreamlineMetrics:
If you want to redirect all index.html(s) to their roots, then try this code -
RewriteEngine On
RewriteRule ^index.html$ / [R=301,L]
RewriteRule ^(.*)/index.html$ /$1/ [R=301,L]And yes, Google will treat them as 301 redirects so your juice will be transferred and consolidated.
Obviously, change index.html to index.htm
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
PDF best practices: to get them indexed or not? Do they pass SEO value to the site?
All PDFs have landing pages, and the pages are already indexed. If we allow the PDFs to get indexed, then they'd be downloadable directly from google's results page and we would not get GA events. The PDFs info would somewhat overlap with the landing pages info. Also, if we ever need to move content, we'd now have to redirects the links to the PDFs. What are best practices in this area? To index or not? What do you / your clients do and why? Would a PDF indexed by google and downloaded directly via a link in the SER page pass SEO juice to the domain? What if it's on a subdomain, like when hosted by Pardot? (www1.example.com)
Reporting & Analytics | | hlwebdev1 -
How can I remove parameters from the GSC URL blocking tool?
Hello Mozzers My client's previous SEO company went ahead and blindly blocked a number of parameters using the GSC URL blocking tool. This has now caused Google to stop crawling many pages on my client's website and I am not sure how to remove these blocked parameters so that they can be crawled and reindexed by Google. The crawl setting is set to "Let Google bot decide" but still there has been a drop in the number of pages being crawled. Can someone please share their experience and help me delete these blocked parameters from GSC's URL blocking tool. Thank you Mozzers!
Reporting & Analytics | | Vsood0 -
Not many pages being indexed on google
Hi I am putting in to Google: site:www.mysite.com to see the pages listed on Google - the figure Google is coming back with is much lower than the actual pages, I have no crawer warning etc... What could the problem be? Thanks
Reporting & Analytics | | acumenadagency0 -
Landing page URL appearing as keyword
Hi Mozers, I've recently experienced the URLs of my key landing pages coming up as keywords. This has been on the rise since early July (when it was relatively insignificant) to the current position (see image below) where they make up the majority of my top keywords. Drilling down into a bit more detail, this seems to be almost exclusively Desktop traffic but in terms of Technology there are no clear standouts (seems to be mostly Windows OS and Chrome). Has anyone else been experiencing this?
Reporting & Analytics | | mopland0 -
Regex Filter To Exlude lower case urls
Buon Pormeriggio from Wetherby 22 degrees the summer continues! I need to set up a regex filter to knock out lowecase versions of http://www.sandersonweatherall.co.uk/Sales/ Thing is Analytics is returning this lowercase version which i want to regex filter out.So if Regex filter /Sales/$ returns what i eant how do i knock out urls beginning with lowe case s. Grazie,
Reporting & Analytics | | Nightwing
David0 -
Reasons for drop in URLs Receiving Entrances Via Search
Hi I'm having trouble understanding why I'm getting the results I am for my organic traffic data. I've been focussing on a few keywords throughout my website and the most recent results show that there is a big increase in the Organic Search Visits and the Non-Paid Keywords Sending Search Visits for both Branded Keywords and Non-branded Keywords, but the results for URLs Receiving Entrances Via Search are the complete opposite. Down by a few percent. I don't understand why this would happen and was hoping that someone could maybe explain and give a few reasons for why this is happening and maybe give some tips on how to stop it from happening in the future if possible. Thanks.
Reporting & Analytics | | Bonx0 -
How to change a url for google analytics account
We recently changed the url of a client's website. Is there a way to change the url on the ga account instead of creating a new account so that we don't lose comparative data? Thanks! Sorry- I know this is a novice question.
Reporting & Analytics | | marketing12340