Index.php canonical/dup issues
-
Hello my fellow SEOs!
I would LOVE some additional insight/opinions on the following...
I have a client who is an industry leader, big site, ranks for many competitive phrases, blah blah..you get the picture.
However, they have a big dup content/canonical issue. Most pages resolve with and without the /index.php at the end of the URL. Obviously this is a dup content issue but more importantly they SEs sometimes serve an "index.php" version of the page, sometimes they don't, and it is constantly changing which version it serves and the rank goes up and down.
Now, I've instructed them that we are going to need to write a sitewide redirect to attempt a uniform structure. Most people would say, redirect to the non index.php version buttttt
1. The index.php pages consistently outperforms the non index.php versions, except the homepage.
2. The client really would prefer to have the "index.php" at the end of the URL
The homepage performs extremely well for a lot of competitive phrases. I'd like to redirect all pages to the "index.php" version except the homepage and I'm thinking that if I redirect all pages EXCEPT the homepage to the index.php version, it could cause some unforeseen issues.
I can not use rel=canonical because they have many different versions of the their pages with different country codes in the URL..example, if I make the US version canonical, it will hurt the pages trying to rank with a fr URL, de URL, (where fr/de are country codes in the URL depending where the user is, it serves the correct version).
Any advice would be GREATLY appreciated. Thanks in advance!
Mike
-
Have you checked the backlinks? The only logical reason I can think of for the index.php versions of the URL to outperform the friendly versions is more sites have linked to them.
I would make every effort to convince the client to use friendly URLs. Users clearly prefer them and technologies change. Even if they are using .php today, in a couple years it may be a dead technology and they will have to redirect their entire site. It's not a logical business move.
With the above noted, if you wish to perform the redirect of all pages except the home page to the index.php form of the URL, it is doable with the proper regex expression. The issues I foresee have already been shared:
-
URLs are harder to read by users and are therefore less friendly
-
URLs are longer so therefore more difficult to share naturally in tweets (for example) without a URL shortening service
-
URLs include "php" so when the site's technology changes the URLs will need to be redirected
-
Users may experience confusion related to the inconsistent URL formats of the home page and the rest of the site
-
Long URLs are cut off. You mentioned using other languages. If a page's title involves foreign characters, those characters are converted in the URL to ?unicode. It is where you see characters like "%20" replace a single character. With foreign URLs the length can often exceed maximums which is an issue. Keeping index.php is an extra 9 characters added to every URL.
This decision approaches the SEO equivalent of a patient going against their doctor's advice. If it was my client, I would want a very firm acknowledgment this decision was against my advice and industry best practices.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate pages and Canonicals
Hi all, Our website has more than 30 pages which are duplicates. So canonicals have been deployed to show up only 10 of these pages. Do more of these pages impact rankings? Thanks
Intermediate & Advanced SEO | | vtmoz0 -
Sitemap indexing
Hi everyone, Here's a duplicate content challenge I'm facing: Let's assume that we sell brown, blue, white and black 'Nike Shoes model 2017'. Because of technical reasons, we really need four urls to properly show these variations on our website. We find substantial search volume on 'Nike Shoes model 2017', but none on any of the color variants. Would it be theoretically possible to show page A, B, C and D on the website and: Give each page a canonical to page X, which is the 'default' page that we want to rank in Google (a product page that has a color selector) but is not directly linked from the site Mention page X in the sitemap.xml. (And not A, B, C or D). So the 'clean' urls get indexed and the color variations do not? In other words: Is it possible to rank a page that is only discovered via sitemap and canonicals?
Intermediate & Advanced SEO | | Adriaan.Multiply1 -
Index or not index Categories
We are using Yoast Seo plugin. On the main menu we have only categories which has consist of posts and one page. We have category with villas, category with villa hotels etc. Initially we set to index and include in the sitemap posts and excluded categories, but I guess it was not correct. Would be a better way to index and include categories in the sitemap and exclude the posts in order to avoid the duplicate? It somehow does not make sense for me, If the posts are excluded and the categories included, will not then be the categories empty for google? I guess I will get crazy of this. Somebody has perhaps more experiences with this?
Intermediate & Advanced SEO | | Rebeca10 -
Google Indexed my Site then De-indexed a Week After
Hi there, I'm working on getting a large e-commerce website indexed and I am having a lot of trouble.
Intermediate & Advanced SEO | | Travis-W
The site is www.consumerbase.com. We have about 130,000 pages and only 25,000 are getting indexed. I use multiple sitemaps so I can tell which product pages are indexed, and we need our "Mailing List" pages the most - http://www.consumerbase.com/mailing-lists/cigar-smoking-enthusiasts-mailing-list.html I submitted a sitemap a few weeks ago of a particular type of product page and about 40k/43k of the pages were indexed - GREAT! A week ago Google de-indexed almost all of those new pages. Check out this image, it kind of boggles my mind and makes me sad. http://screencast.com/t/GivYGYRrOV While these pages were indexed, we immediately received a ton of traffic to them - making me think Google liked them. I think our breadcrumbs, site structure, and "customers who viewed this product also viewed" links would make the site extremely crawl-able. What gives?
Does it come down to our site not having enough Domain Authority?
My client really needs an answer about how we are going to get these pages indexed.0 -
Rel canonical and duplicate subdomains
Hi, I'm working with a site that has multiple sub domains of entirely duplicate content. So, the production level site that visitors see is (for made-up illustrative example): 123abc456.edu Then, there are sub domains which are used by different developers to work on their own changes to the production site, before those changes are pushed to production: Larry.123abc456.edu Moe.123abc456.edu Curly.123abc456.edu Google ends up indexing these duplicate sub domains, which is of course not good. If we add a canonical tag to the head section of the production page (and therefor all of the duplicate sub domains) will that cause some kind of problem... having a canonical tag on a page pointing to itself? Is it okay to have a canonical tag on a page pointing to that same page? To complete the example... In this example, where our production page is 123abc456.edu, our canonical tag on all pages (this page and therefor the duplicate subdomains) would be: Is that going to be okay and fix this without causing some new problem of a canonical tag pointing to the page it's on? Thanks!
Intermediate & Advanced SEO | | 945010 -
Duplicate Content Issue
Why do URL with .html or index.php at the end are annoying to the search engine? I heard it can create some duplicate content but I have no idea why? Could someone explain me why is that so? Thank you
Intermediate & Advanced SEO | | Ideas-Money-Art0 -
Ranking with other pages not index
The site ranks on page 4-5 with other page like privacy, about us, term pages. I encounter this problem allot in the last weeks; this usually occurs after the page sits 1-2 months on page 1 for the terms. I'm thinking of to much use the same anchor as a primary issue. The sites in questions are 1-5 pages microniche sites. Any suggestions is appreciated. Thank You
Intermediate & Advanced SEO | | m3fan0 -
Static index page or not?
Are there any advantages of dis-advantages to running a static homepage as opposed to a blog style homepage. I have be running a static page on my site with the latest posts displayed as links after the homepage content. I would like to remove the static page and move to a more visually appealing homepage that includes graphics for each post and the posts droppping down the page like normal blogs do. How will this effect my site if I move from a static page to a more dynamic blog style page layout? Could I still hold the spot I currently rank for with the optimized index content if I turn to a more traditional blog format? cheers,
Intermediate & Advanced SEO | | NoCoGuru0