Index.php canonical/dup issues
-
Hello my fellow SEOs!
I would LOVE some additional insight/opinions on the following...
I have a client who is an industry leader, big site, ranks for many competitive phrases, blah blah... you get the picture.
However, they have a big duplicate content/canonical issue. Most pages resolve both with and without /index.php at the end of the URL. Obviously this is a duplicate content issue, but more importantly the search engines sometimes serve the "index.php" version of a page and sometimes the non-index.php version; which version gets served keeps changing, and the rankings go up and down with it.
Now, I've instructed them that we are going to need to write a sitewide redirect to enforce a uniform structure. Most people would say redirect to the non-index.php version, but:
1. The index.php pages consistently outperform the non-index.php versions, except for the homepage.
2. The client would really prefer to keep "index.php" at the end of the URL.
The homepage performs extremely well for a lot of competitive phrases, so I'd like to redirect every page except the homepage to the "index.php" version, but I'm thinking this mixed structure could cause some unforeseen issues.
I can't use rel=canonical because they have many different versions of their pages with different country codes in the URL. For example, if I make the US version canonical, it will hurt the pages trying to rank with a fr URL or a de URL (where fr/de are country codes in the URL; the site serves the correct version depending on where the user is).
Any advice would be GREATLY appreciated. Thanks in advance!
Mike
-
Have you checked the backlinks? The only logical reason I can think of for the index.php versions of the URLs to outperform the friendly versions is that more sites have linked to them.
I would make every effort to convince the client to use friendly URLs. Users clearly prefer them, and technologies change. Even if they are using .php today, in a couple of years it may be a dead technology and they will have to redirect their entire site. Keeping "index.php" in the URLs is not a logical business move.
With the above noted, if you wish to redirect all pages except the home page to the index.php form of the URL, it is doable with the proper rewrite rules and regular expressions (a rough sketch follows the list of issues below). The issues I foresee are these:
-
URLs are harder for users to read and are therefore less friendly
-
URLs are longer and therefore more difficult to share naturally in tweets (for example) without a URL-shortening service
-
URLs include "php" so when the site's technology changes the URLs will need to be redirected
-
Users may be confused by the inconsistent URL formats of the home page versus the rest of the site
-
Long URLs get cut off. You mentioned using other languages: if a page's title involves non-ASCII characters, those characters are percent-encoded in the URL. That's where you see sequences like "%20" (a single space) or "%C3%A9" (an "é") replace a single character. With foreign-language URLs the length can often exceed practical maximums, which is an issue, and keeping index.php adds an extra 9 characters to every URL.
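For reference, here is a minimal sketch of what those redirect rules could look like in an Apache .htaccess file. It assumes mod_rewrite is enabled and that page URLs are directory-style paths ending in a trailing slash (the paths shown are hypothetical); the exact conditions depend on how the CMS builds its URLs, so test on a staging copy first.

RewriteEngine On

# Homepage is the exception: send /index.php back to the bare root URL.
RewriteRule ^index\.php$ / [R=301,L]

# Every other directory-style URL (ending in "/") is 301-redirected to its
# index.php form. Static assets are untouched because they don't end in "/".
RewriteRule ^(.+)/$ /$1/index.php [R=301,L]

With rules like these, a request for /fr/widgets/ would 301 to /fr/widgets/index.php, while /index.php at the root would 301 back to /, keeping the homepage on the non-index.php version that already ranks so well.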
This decision approaches the SEO equivalent of a patient going against their doctor's advice. If it were my client, I would want a very firm acknowledgment that this decision was against my advice and industry best practices.