How to find original URLS after Hosting Company added canonical URLs, URL rewrites and duplicate content.
-
We recently changed hosting companies for our ecommerce website. The hosting company added some functionality such that duplicate content and/or mirrored pages appear in the search engines.
To fix this problem, the hosting company created both canonical URLs and URL rewrites. Now, we have page A (which is the original page with all the link juice) and page B (which is the new page with no link juice or SEO value). Both pages have the same content, with different URLs.
I understand that a canonical URL is the way to tell the search engines which page is the preferred page in cases of duplicate content and mirrored pages. I also understand that canonical URLs tell the search engine that page B is a copy of page A, but page A is the preferred page to index.
The problem we now face is that the hosting company made page A a copy of page B, rather than the other way around. But page A is the original page with the seo value and link juice, while page B is the new page with no value. As a result, the search engines are now prioritizing the newly created page over the original one.
I believe the solution is to reverse this and make it so that page B (the new page) is a copy of page A (the original page). Now, I would simply need to put the original URL as the canonical URL for the duplicate pages. The problem is, with all the rewrites and changes in functionality, I no longer know which URLs have the backlinks that are creating this SEO value.
I figure if I can find the back links to the original page, then I can find out the original web address of the original pages.
My question is, how can I search for back links on the web in such a way that I can figure out the URL that all of these back links are pointing to in order to make that URL the canonical URL for all the new, duplicate pages.
-
My first question back at you is why is your hosting company deciding your canonical structure for you? a hosting company should just host and you should be using canonical tags and .htaccess to sort out your own canonical URLs. So my gut reaction is to change hosting company.
After that:
- Open Site Explorer - put both URLs in and see which ones are being linked to and which has the highest value.
If you're not a PRO member then take the 30 day trial and use OSE to sort this issue out. You'll be glad you did.
After that... seriously as a web designer & SEO, not having control of my own URLs would annoy the heck out of me, so change hosting provider. That'll be the second biggest favour you could give yourself
Slightly opinionated, I know, but at the end of the day it's your website not your host's.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
To what extent is content considered unique or duplicate?
I work primarily on classifieds websites and an issue I consistently come across are two or more URLs which have the exact same ad count, due to site structure and the way everything is categorized. An example of such would be with these two pages: https://www.boatshop24.co.uk/motorboats/princess https://www.boatshop24.co.uk/boats-for-sale/princess/power These two have the exact same number of ads- would search engines mark these as duplicate content? Both have different meta descriptions, title tags etc. but essentially the MC is exactly the same. If they are, what would be the best course to remedy the problem? I'm skeptical about using canonical tags as I generally use them for exact duplicate pages.
Technical SEO | | Sayers0 -
Canonical sitemap URL different to website URL architecture
Hi, This may or may not be be an issue, but would like some SEO advice from someone who has a deeper understanding. I'm currently working on a clients site that has a bespoke CMS built by another development agency. The website currently has a sitemap with one link - EG: www.example.com/category/page. This is obviously the page that is indexed in search engines. However the website structure uses www.example.com/page, this isn't indexed in search engines as the links are canonical. The client is also using the second URL structure in all it's off and online advertising, internal links and it's also been picked up by referral sites. I suspect this is not good practice... however I'd like to understand whether there are any negative SEO effectives from this structure? Does Google look at both pages with regard to visits, pageviews, bounce rate, etc. and combine the data OR just use the indexed version? www.example.com/category/page - 63.5% of total pageviews
Technical SEO | | MikeSutcliffe
www.example.com/page - 34.31% of total pageviews Thanks
Mike0 -
Duplicate Content Mystery
Hi Moz community! I have an ongoing duplicate mystery going on here and I'm hoping someone here can answer my question. We have an Ecommerce site that has a variety of product pages and category pages. There are Rel canonicals in place, along with parameters in GWT, and there are also URL rewrites. Here are some scenarios, maybe you can give insight as to what’s exactly going on and how to fix it. All the duplicates look to be coming from category pages specifically. For example:
Technical SEO | | Ecom-Team-Access
This link re-writes: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html?cat=407&color=152&price=20- To: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html The rel canonical tag looks like this: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" /> The CONTENT is different, but the URLs are the same. It thinks that the product category view is the same as the all products view, even though there is a canonical in there telling it which one is the original. Some of them don’t have anything to do with each other. Take a look: Link identified as duplicate: http://www.incipio.com/cases/smartphone-cases/htc-smartphone-cases/htc-windows-phone-8x-cases.html?color=27&price=20- Link this is a duplicate of: http://www.incipio.com/cases/macbook-cases/macbook-pro-13in-cases.html Any idea as to what could be happening here?0 -
Canonical URLs on location based offers
hello world. i offer first aid courses in different locations in switzerland. now i'm not sure if i have to make the single registration pages rel="canonical" or not. example:
Technical SEO | | alekaj
location 1 -> course list @ Nothelferkurse Thun
location 2 -> course list @ Nothelferkurse Bern the content is almost the same. how do i have to handle with it? thanks for your help!2 -
Duplicate content problem
Hi, i work in joomla and my site is www.in2town.co.uk I have been looking at moz tools and it is showing i have over 600 pages of duplicate content. The problem is shown below and i am not sure how to solve this, any help would be great, | Benidorm News http://www.in2town.co.uk/benidorm-news/Page-2 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-102 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-103 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-104 9 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-106 28 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-11 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-112 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-114 45 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-115 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-116 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-12 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-120 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-123 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-13 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-130 50 23 3 In2town http://www.in2town.co.uk/blog/In2town/Page-131 50 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-132 31 22 3 In2town http://www.in2town.co.uk/blog/In2town/Page-140 4 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-141 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-21 10 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-22 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-23 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-26 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-271 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-274 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-277 50 21 2 In2town http://www.in2town.co.uk/blog/In2town/Page-28 50 21 2 In2town http://www.in2town.co.uk/blog/In2town/Page-29 50 18 1 In2town http://www.in2town.co.uk/blog/In2town/Page-310 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-341 21 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-342 4 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-343 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-345 1 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-346 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-348 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-349 50 1 0 In2town http://www.in2town.co.uk/blog/In2town/Page-350 50 16 0 In2town http://www.in2town.co.uk/blog/In2town/Page-351 50 19 1 In2town http://www.in2town.co.uk/blog/In2town/Page-82 24 1 0 In2town http://www.in2town.co.uk/blog/in2town 50 20 1 In2town http://www.in2town.co.uk/blog/in2town/Page-10 50 23 3 In2town http://www.in2town.co.uk/blog/in2town/Page-100 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-101 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-105 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-107 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-108 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-109 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-110 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-111 50 22 3 In2town http://www.in2town.co.uk/blog/in2town/Page-113 |
Technical SEO | | ClaireH-1848860 -
Duplicated content in moz report due to Magento urls in a multiple language store.
Hi guys, Moz crawl is reporting as duplicated content the following urls in our store: http://footdistrict.com and http://footdistrict.com?___store=footdistrict_es The chain: ___store=footdistrict_es is added as you switch the language of the site. Both pages have the http://footdistrict.com" /> , but this was introduced some time after going live. I was wondering the best action to take considering the SEO side effects. For example: Permanent redirect from http://footdistrict.com?___store=footdistrict_es to http://footdistrict.com. -> Problem: If I'm surfing through english version and I switch to spanish, apache will realize that http://footdistrict.com?___store=footdistrict_es is going to be loaded and automatically it will redirect you to http:/footdistrict.com. So you will stay in spanish version for ever. Deleting the URLS with the store code from Google Web Admin tools. Problem: What about the juice? Adding those URL's to robots.txt. Problem: What about the juice? more options? Basically I'm trying to understand the best option to avoid these pages being indexed. Could you help here? Thanks a lot.
Technical SEO | | footd0 -
Duplicate Content Issue
Very strange issue I noticed today. In my SEOMoz Campaigns I noticed thousands of Warnings and Errors! I noticed that any page on my website ending in .php can be duplicated by adding anything you want to the end of the url, which seems to be causing these issues. Ex: Normal URL - www.example.com/testing.php Duplicate URL - www.example.com/testing.php/helloworld The duplicate URL displays the page without the images, but all the text and information is present, duplicating the Normal page. I Also found that many of my PDFs seemed to be getting duplicated burried in directories after directories, which I never ever put in place. Ex: www.example.com/catalog/pdfs/testing.pdf/pdfs/another.pdf/pdfs/more.pdfs/pdfs/ ... when the pdfs are only located in a pdfs directory! I am very confused on how to fix this problem. Maybe with some sort of redirect?
Technical SEO | | hfranz0 -
CGI Parameters: should we worry about duplicate content?
Hi, My question is directed to CGI Parameters. I was able to dig up a bit of content on this but I want to make sure I understand the concept of CGI parameters and how they can affect indexing pages. Here are two pages: No CGI parameter appended to end of the URL: http://www.nytimes.com/2011/04/13/world/asia/13japan.html CGI parameter appended to the end of the URL: http://www.nytimes.com/2011/04/13/world/asia/13japan.html?pagewanted=2&ref=homepage&src=mv Questions: Can we safely say that CGI parameters = URL parameters that append to the end of a URL? Or are they different? And given that you have rel canonical implemented correctly on your pages, search engines will move ahead and index only the URL that is specified in that tag? Thanks in advance for giving your insights. Look forward to your response. Best regards, Jackson
Technical SEO | | jackson_lo0