Can anyone explain why and how these odd URLs could be working?
-
In our GWT and Google Analytics traffic reports, I often see some very oddly formed URLs. Here's an example:
http://www.ccisolutions.com/storefront/www.ccisolutions.com
and here's another:
http://www.ccisolutions.com/StoreFront/category//www.ccisolutions.com/StoreFront/CEW.cat
Two things strike me about this particular URL:
- It renders this page, http://www.ccisolutions.com/StoreFront/category/on-disc-printing, but without redirecting to that URL; the address bar stays at http://www.ccisolutions.com/StoreFront/category//www.ccisolutions.com/StoreFront/CEW.cat
- When I break this URL into pieces,
http://www.ccisolutions.com/StoreFront/category/CEW.cat
and www.ccisolutions.com/StoreFront/CEW.cat,
both redirect to: http://www.ccisolutions.com/StoreFront/category/on-disc-printing
This makes me wonder: is there something (a rule?) in the backend (maybe the .htaccess file?) that was set up that says
http://www.ccisolutions.com/StoreFront/category/CEW.cat = www.ccisolutions.com/StoreFront/CEW.cat
(or maybe vice versa?), and as a result an odd URL for the page is being written automatically?
This scenario worked on every category page I checked. All had the same results. For example, I tried:
http://www.ccisolutions.com/StoreFront/category//www.ccisolutions.com/StoreFront/AAA.cat
and it rendered the Live Sound category page, but without redirecting to the user-friendly URL. This URL stayed unchanged in the address bar.
When I broke it into pieces, like
http://www.ccisolutions.com/StoreFront/category/AAA.cat
and www.ccisolutions.com/StoreFront/AAA.cat, both of these redirected to http://www.ccisolutions.com/StoreFront/category/sound-video-lighting-equipment-experts
Have any of you ever encountered a problem like this? Any suggestions as to what might be causing it and how to remedy the problem? It is definitely causing us a duplicate content headache. Thanks!
Dana
-
Thanks George! Fantastic detail, and I think between your suggestions and Ben's we are going to get closer to solving this than we've ever gotten before. Perhaps we'll even solve it. That would be so great. As I mentioned, the company identified this problem 4 years before they hired me, and it's never been solved. I feel like part of why I am there as their SEO strategist is to pound away at these problems until they're fixed.
Thanks so much to you both. I can't wait to go in on Monday morning and use these suggestions to solve a five-year-old problem! Awesome.
I'll let you know what happens. If we fix it, I owe you and Ben dinner! (at the very least)
-
Thanks Ben. No apology necessary, it's all good. Your suggestion in combination with George's could lead us to an answer. This is definitely going to get us closer to finding the problem than we've ever gotten before. The company has been aware of this problem for almost 5 years but hasn't ever identified how to fix it. I've only been there a year now and I'm on the warpath to fix these technical issues. There are so many of them causing duplicate content problems that any SEO I do is undermined by problems like these.
I really really appreciate your reply and suggestions!
Dana
-
I'm not sure what CMS you are using, but I've seen this before with Joomla when setting the SEO Settings in the Global Configuration section of the Administration panel, specifically when working with the Apache mod_rewrite setting, which is related to the .htaccess question you had.
There are a number of things wrong with the way some CMSs have set up their redirects and how they present content. You may end up trying each combination of settings to fix your issue (depending on how you want to fix it).
If I were looking into this, I would do the following:
- I would determine if I was using Joomla. If so, check your configuration.php file and see if you have your domain name provided in the property for "live_site". If you do, try changing this from 'www.ccisolutions.com' (or whatever is there) to the empty string '' (aka just two single quotes).
- If you are not using Joomla, see if there is a configuration file for the CMS you are using and look for something similar to the above.
- If there is not a configuration setting that is providing for this "duplication" of domain name, look at the .htaccess file itself to see why it redirects when you break the URL up, but not when it has a second domain string in the URL (e.g. the second "www.ccisolutions.com").
- Then look at the code for the CMS and see how it interprets your URLs. To me it looks like you are using some sort of MVC framework which is taking each piece of the URL and translating it into variables to determine what content to show (REST-like). When it is parsing the URL, it seems to be looking for the end of the domain name and then taking anything off the end to translate into content.
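To illustrate the last point, here is a minimal sketch of that hypothesis. It assumes, purely for illustration, a dispatcher that looks only at the final path segment to decide which category to show; `category_code` is a hypothetical helper, not anything taken from the actual storefront software.

```python
from urllib.parse import urlparse


def category_code(url: str) -> str:
    """Return the last non-empty path segment of a URL.

    Hypothetical rule: the real dispatcher may work differently.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    return segments[-1] if segments else ""


odd = "http://www.ccisolutions.com/StoreFront/category//www.ccisolutions.com/StoreFront/CEW.cat"
clean = "http://www.ccisolutions.com/StoreFront/category/CEW.cat"

# Both URLs end in the same segment, so a dispatcher following this
# rule would pick the same category for each of them.
print(category_code(odd))
print(category_code(clean))
```

If the software behaves roughly like this, anything placed between the domain and the final segment is ignored when choosing content, which would explain why the malformed URL still renders a real category page.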
However you figure out the issue, I suggest looking at how your CMS is actually producing the canonical tag. Right now this URL (http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/CEW.cat) is using the following canonical:
<link rel="canonical" href="on-disc-printing"/>
I don't think that is what you are looking for in your canonical tags.
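A short sketch of why a relative canonical is risky: a relative href is resolved against the URL of the page it appears on, so the same tag can point to different places. This uses Python's standard `urljoin` and assumes the page has no `<base>` element that would change the resolution.

```python
from urllib.parse import urljoin

canonical_href = "on-disc-printing"  # the relative value found in the tag

odd_page = "http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/CEW.cat"
clean_page = "http://www.ccisolutions.com/StoreFront/category/on-disc-printing"

# The same relative href resolves differently depending on the page URL.
from_odd = urljoin(odd_page, canonical_href)
from_clean = urljoin(clean_page, canonical_href)

print(from_odd)
print(from_clean)
```

On the clean page the canonical resolves to the page itself, but on the malformed URL it resolves to yet another malformed URL. An absolute href (full scheme and host) avoids this.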
I hope this helps and answers your questions.
-
Hi Dana,
I wrote the following after assuming, for no reason at all, that you didn't know much about SEO. However, having looked at your profile, I realized that I was wrong and that my tone is probably a little patronizing. That being said, it's 1am over here and I really don't want to rewrite it, so please accept my apologies.
If I had to guess (and it is a guess as I'm not technical) I would say it was some badly formed links.
You know how some of your error pages have an Origin parameter (like this one) that say where the page was generated? Well these URLs follow the same format as the error pages that you're finding. It looks like rather than using an absolute link (like http://www.ccisolutions.com/page) the onclick action is actually generating a relative link (so just /page).
When you use a relative link, the partial URL (/page) is added onto the end of your domain to give you a full URL (http://www.ccisolutions.com + /page = http://www.ccisolutions.com/page). It looks like relative links are being used as if they were absolute ones, which is why you have "www.ccisolutions.com" in each URL twice.
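One way this can happen, sketched with Python's standard `urljoin`: if a link's href contains the domain but no `http://` in front, it is resolved like any other relative path. The href values below are assumptions for illustration; I have not seen the site's actual link markup.

```python
from urllib.parse import urljoin

current_page = "http://www.ccisolutions.com/StoreFront/category/on-disc-printing"

# Hypothetical hrefs: the first is missing its scheme, the second is complete.
href_without_scheme = "www.ccisolutions.com/StoreFront/CEW.cat"
href_with_scheme = "http://www.ccisolutions.com/StoreFront/CEW.cat"

# Without a scheme, the whole string is treated as a relative path and is
# appended to the directory of the current page.
broken = urljoin(current_page, href_without_scheme)
intact = urljoin(current_page, href_with_scheme)

print(broken)
print(intact)
```

The first result has the domain in it twice, in the same shape as the odd URLs showing up in the reports.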
If I had to blame anything, it would be whatever is powering your IAFDispatcher; however, as I haven't been able to replicate your problem, I can't be certain. If you can track how these URLs were generated by looking at the preceding pages that are sending traffic/bots to them, then you should be able to narrow it down to which links are broken.
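A rough sketch of that tracking step, assuming you can export pairs of (landing URL, referrer) from your analytics or server logs. The function names and the sample rows are made up for illustration.

```python
from urllib.parse import urlparse


def has_embedded_host(url: str) -> bool:
    """True if the URL's own hostname appears again inside its path."""
    parsed = urlparse(url)
    return bool(parsed.netloc) and parsed.netloc.lower() in parsed.path.lower()


def suspect_referrers(rows):
    """Return the sorted, de-duplicated referrers of malformed landing URLs."""
    return sorted({ref for url, ref in rows if has_embedded_host(url)})


# Made-up sample rows: (landing URL, referrer).
rows = [
    ("http://www.ccisolutions.com/storefront/www.ccisolutions.com",
     "http://www.ccisolutions.com/StoreFront/category/on-disc-printing"),
    ("http://www.ccisolutions.com/StoreFront/category/on-disc-printing",
     "http://www.ccisolutions.com/"),
]

print(suspect_referrers(rows))
```

The pages that come back are the ones to inspect for links with a missing `http://`.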