Redirecting all URLs appended with index.htm or index.html
-
One of my clients (a WordPress website) has for some time had URLs in the Landing Pages report in Google Analytics (GA) that should all point to a single page. For example:
domain.com/about-us also has a listing in GA as domain.com/about-us/index.htm
Is this some kind of indication of a subdirectory issue? Has anyone had experience with this in WordPress plugins such as Yoast SEO, or another SEO plugin?
My thought here is to simply redirect any of these non-existent files with a redirect in .htaccess, but what I'm using isn't working. I will insert the redirect here, and any help would be greatly appreciated.
RewriteEngine on
RewriteCond %{THE_REQUEST} ^.*/index.html?
RewriteRule ^(.*)index.html?$ http://www.dupontservicecenter.com/$1 [R=301,L]
and this rewrite doesn't work:
RewriteEngine on
RewriteRule ^(.+).htm$ http://dupontservicecenter.com/$1.php [R,NC]
_Cindy
-
ThompsonPaul,
Thank you! I've looked at that feature so many times, and read and reread the info Google provided. Read literally, as someone at my level would, it really doesn't specify whether entering a default page adds index.htm(l) to the URL and so combines all the "same pages", or removes it to combine them. I assumed the latter, since that is what happens with permalinks in WP... go figure. Now I realize it adds. It also didn't occur to me that this feature wouldn't act as a filter would, with results visible right away.
OK, so I have removed "index.htm" from the Default Page field; that is all clear now. Additionally, I am also seeing "index.html" appended to my URLs, in addition to the actual URL. So I am seeing, for example:
/about-us/, /about-us/index.htm, and in some cases URLs like /about-us/index.html.
I can only guess that at one time both of these default pages, "index.html" and "index.htm", were in the Default Page setting. In any case, these pages with index.htm(l) do not exist, which would suggest that this issue likely concerns settings in GA.
-
So one more perplexing issue: in the Search Console landing page report I am showing 0 hits for any URL appended with either index.htm or index.html.
-
But in the regular landing page reports, and also in custom reports, these pages (those appended with index.htm(l)) are showing hits. What could cause this discrepancy?
-
As you suggested, it would take a bit of filtering to clean up these URLs in Google Analytics? And so, if the issue is in Google Analytics, then any redirect in the htaccess file is for naught?
-
So it will likely take several weeks, for this small business site, to begin showing clean URLs and to see whether this is actually the issue?
Thank you so very much!
_Cindy
-
-
Thomas, thank you for your help. It did occur to me that perhaps the order of items in the htaccess file may be the issue.
I am going to look into this issue - thanks to your suggestion, and then see if my redirects are working as they should.
When I do, I'll get back to you on this topic.
Now, I'm trying to wrap my mind around why I'm seeing "index.htm" and "index.html" when my site is WP based and therefore built on PHP. ThompsonPaul has responded with what was my next look (and actually a 4th to 8th look) concerning the Default Page setting in GA.
Thanks again.
_Cindy
-
Cindy, this is almost certainly an issue with the way your Google Analytics is configured, not your WP site. (The fact that "index.htm" comes after a "/" is the clue.)
If you check the View Settings link under the View in the Admin section of your dashboard, you'll find a field called Default Page. For most correctly configured modern sites (WP sites included), this field must be empty for GA to be configured correctly. I'm betting your config has index.htm entered in that field. [See screenshot below.]
Once you remove that entry, your data will avoid the problem going forward, but it will take some work with custom filters if you want to try to clean up the historical data.
Let me know if that solves the issue?
Paul
-
Are you able to copy out your whole htaccess?
I've got to admit, I'm not the best with it, but I'll try to help you figure this out.
-
Hi Thomas,
Very much appreciate your response.
So far none of the redirects are working, including your suggestion. So I tested the htaccess file with this redirect, changing one of the redirects that has been listed in the htaccess file for some time now and which used to work...
RewriteCond %{HTTP_HOST} ^dupontservicecenter.com/buying-and-selling$
RewriteRule ^$ http://dupontservicecenter.com/rewards/auto-service-credit [L,R=301]
...not working; it is redirecting to the old URL, the one I changed.
I have purged the cache (using LiteSpeed Cache for WP, since I'm on a LiteSpeed server these days). Could it be a purge issue? What would cause the htaccess file not to work properly?
The only redirect that is working is through a plugin for wp - quick redirects which uses the wp_redirect() function.
Totally lost in a haystack.
Any further suggestions would be helpful, otherwise, a complete, timely, breakdown of all website components will have to be proposed to the client.
_Cindy
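One thing that stands out in the test redirect above: %{HTTP_HOST} holds only the host name from the request, never the path, so a condition that includes /buying-and-selling cannot match. A minimal sketch of the same redirect with the host and the path checked separately follows. It assumes the old page lives at /buying-and-selling, that the rule sits in the .htaccess file at the document root, and that both the www and non-www hosts should match; those details are guesses, not taken from the actual file.

```apache
RewriteEngine On

# %{HTTP_HOST} contains only the host name, never the path,
# so the host check and the path check are kept separate.
# The optional "www." prefix is an assumption.
RewriteCond %{HTTP_HOST} ^(?:www\.)?dupontservicecenter\.com$ [NC]

# In .htaccess the pattern is matched against the path without its leading slash.
# The old path "buying-and-selling" is assumed from the condition in the post above.
RewriteRule ^buying-and-selling/?$ http://dupontservicecenter.com/rewards/auto-service-credit [R=301,L]
```

Browsers also cache 301 responses, so a changed redirect target can appear to be ignored until it is tested in a private window or with a command-line client.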
-
https://moz.com/community/q/redirecting-index-html-to-the-root
StreamlineMetrics:
If you want to redirect all index.html(s) to their roots, then try this code -
RewriteEngine On
RewriteRule ^index.html$ / [R=301,L]
RewriteRule ^(.*)/index.html$ /$1/ [R=301,L]
And yes, Google will treat them as 301 redirects so your juice will be transferred and consolidated.
Obviously, change index.html to index.htm
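The quoted rules can be folded into one rule that covers both index.htm and index.html. This is only a sketch under a few assumptions: it sits in the .htaccess file at the document root, it is placed above the "# BEGIN WordPress" block so it runs before WordPress's own catch-all rewrite, and the server (Apache, or LiteSpeed reading .htaccess) has mod_rewrite-style rules enabled. It only affects requests that actually reach the server, so it will not change URLs that Google Analytics constructs itself through the Default Page setting.

```apache
RewriteEngine On

# Match only when the client itself asked for index.htm or index.html.
# THE_REQUEST is the raw request line, e.g. "GET /about-us/index.htm HTTP/1.1",
# so internal rewrites to an index file do not trigger a redirect loop.
RewriteCond %{THE_REQUEST} \s/+(?:[^?\s]*/)?index\.html?[?\s] [NC]

# Redirect /index.htm(l) to / and /any/path/index.htm(l) to /any/path/
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L,NC]
```

The optional capture group keeps the directory part with its trailing slash, so a single rule handles both the root and subdirectories.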