Redirecting all URLs appended with index.htm or index.html
-
One of my clients (a WordPress website) has for some time had URLs in the Landing Pages report in Google Analytics (GA) that should all point to the same page. For example:
domain.com/about-us also has a listing in GA as domain.com/about-us/index.htm
Is this some kind of indication of a subdirectory issue? Has anyone had experience with this in WordPress plugins such as Yoast SEO or other SEO plugins?
My thought is simply to redirect any of these non-existent files with a rule in .htaccess, but what I'm using isn't working. I'll insert the redirect here; any help would be greatly appreciated.
RewriteEngine on
RewriteCond %{THE_REQUEST} ^./index.html?
RewriteRule ^(.)index.html?$ http://www.dupontservicecenter.com/$1 [R=301,L]
and this rewrite doesn't work:
RewriteEngine on
RewriteRule ^(.+).htm$ http://dupontservicecenter.com/$1.php [R,NC]
_Cindy
-
ThompsonPaul,
Thank you! I've looked at that feature so many times, and read and reread the info Google provided. Reading it literally, as someone at my level would, it really doesn't specify whether setting the default page "adds" index.htm(l) to the URL and thereby combines all "same pages", or removes it to combine them.
I assumed the latter, since that is what happens with permalinks in WP... go figure. Now I realize it adds. It also didn't occur to me that this feature wouldn't act as a filter would, with results visible right away.
OK, so I have removed "index.htm" from the Default Page field; that is all clear now. Additionally, I am also seeing "index.html" appended to my URLs, in addition to the actual URL. So I am seeing, for example:
/about-us/, /about-us/index.htm, and in some cases URLs like /about-us/index.html.
I can only guess that at one time both of these default URLs, "index.html" and "index.htm", were in the Default Page setting. In any case, these pages with index.htm(l) do not exist, which this would explain, right? So the issue likely concerns settings in GA.
-
So one more perplexing issue: in the Search Console landing page report I am showing 0 hits for any URL appended with either index.htm or index.html.
-
But in the regular landing pages report, and also in custom reports, these pages (the ones appended with index.htm(l)) are showing hits. What could cause this discrepancy?
-
As you suggested, it would take a bit of filtering to clean up these URLs in Google Analytics? And if the issue is in Google Analytics, then any redirect in the htaccess file is for naught?
-
So it will take several weeks, likely, for this small business site to begin showing clean URLs and to see if this is actually the issue?
Thank you so very much!
_Cindy
-
-
Thomas, thank you for your help. It did occur to me that perhaps the order of items in the htaccess file may be the issue.
I am going to look into this, thanks to your suggestion, and then see if my redirects are working as they should.
When I do, I'll get back to you on this topic.
Now, I'm trying to wrap my mind around why "index.htm" and "index.html" show up at all when my site is WP based and therefore runs on PHP. ThompsonPaul has responded with what was my next look (and actually a 4th to 8th look) concerning the Default Page setting in GA.
Thanks again.
_Cindy
-
Cindy, this is almost certainly an issue with the way your Google Analytics is configured, not your WP site. (The fact that the "index.htm" comes after a "/" is the clue.)
If you check the View Settings link under the View in the Admin section of your dashboard, you'll find a field called Default Page. For most correctly configured modern sites (WP sites included), this field should be left empty. I'm betting your config has index.htm entered in that field. [See screenshot below.]
Once you remove that entry, your data will avoid the problem going forward, but it will take some work with custom filters if you want to try to clean up the historical data.
Let me know if that solves the issue.
Paul
-
Are you able to copy out your whole htaccess?
I've got to admit, I'm not the best with it, but I'll try to help you figure this out.
-
Hi Thomas,
Very much appreciate your response.
So far none of the redirects are working, including your suggestion. So I tested the htaccess file by changing one of the redirects that has been listed in it for some time now and which used to work:
RewriteCond %{HTTP_HOST} ^dupontservicecenter.com/buying-and-selling$
RewriteRule ^$ http://dupontservicecenter.com/rewards/auto-service-credit [L,R=301]
...not working, is redirecting to the old url, the one I changed.
I have purged the cache (I'm using LiteSpeed Cache for WP, since I'm on a LiteSpeed server these days). Could it be a purge issue? What would cause the htaccess file not to work properly?
The only redirect that is working is through a WP plugin, Quick Redirects, which uses the wp_redirect() function.
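One likely reason the test rule above never matches, independent of caching: %{HTTP_HOST} holds only the hostname from the request, never the path, so a condition that includes /buying-and-selling cannot be true. A minimal sketch of the same redirect with the path moved into the RewriteRule pattern follows. It assumes the rule sits in the .htaccess at the site root and is placed above the "# BEGIN WordPress" block, neither of which is confirmed in this thread.

```apache
RewriteEngine On
# HTTP_HOST is the Host header only, so match just the hostname
# (accepting an optional "www." is an assumption)
RewriteCond %{HTTP_HOST} ^(www\.)?dupontservicecenter\.com$ [NC]
# In .htaccess the pattern is tested against the path without its leading slash
RewriteRule ^buying-and-selling/?$ http://dupontservicecenter.com/rewards/auto-service-credit [R=301,L]
```

Browsers cache 301 responses, so a previously visited URL can keep going to its old target. Testing in a private window, or with a temporary R=302, helps rule that out.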
Totally lost in a haystack.
Any further suggestions would be helpful, otherwise, a complete, timely, breakdown of all website components will have to be proposed to the client.
_Cindy
-
https://moz.com/community/q/redirecting-index-html-to-the-root
StreamlineMetrics:
If you want to redirect all index.html(s) to their roots, then try this code -
RewriteEngine On
RewriteRule ^index.html$ / [R=301,L]
RewriteRule ^(.*)/index.html$ /$1/ [R=301,L]
And yes, Google will treat them as 301 redirects so your juice will be transferred and consolidated.
Obviously, change index.html to index.htm
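The rules above cover index.html only and would need a second copy for index.htm. A hedged variant that handles both in one rule is sketched below. The THE_REQUEST guard is an addition not in the quoted answer, included so that only URLs the visitor actually requested are redirected and an internal DirectoryIndex lookup cannot loop. On a WordPress site it is assumed to go above the "# BEGIN WordPress" block.

```apache
RewriteEngine On
# Act only when the original request line contains /index.htm or /index.html
RewriteCond %{THE_REQUEST} /index\.html?[\s?] [NC]
# Capture the directory part (if any) and redirect to it, dropping the file name
RewriteRule ^(.*/)?index\.html?$ /$1 [R=301,L,NC]
```

With this in place, /about-us/index.htm would redirect to /about-us/ and /index.html to /.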