Finding the source of duplicate content URLs
-
We have a website that displays a number of products. Each product has variations (sizes), and unfortunately every size has its own URL (for now, anyway). Needless to say, this causes duplicate content issues. (And of course, we are looking to change the URLs for our site as soon as possible.)
However, even though these duplicate URLs exist, you should not be able to land on them by navigating through the site. In theory, the site should always display the link to the smallest size. It seems there is a flaw in our system somewhere, as these links are now being found in our campaign here on SEOmoz.
My question: is there any way to find the crawl path that led to the URLs that shouldn't have been found, so we can locate the problem?
-
Using the Screaming Frog SEO Spider (the free version crawls up to 500 URLs; the paid version [99 GBP for a yearly license] crawls as many as you want), you can see all of the inlinks to a particular page. Run a crawl of the site, find the duplicate pages in the results, and view the inlinks for each one. Then visit those linking pages and check their source code for links to the URL in question. This will quickly show you where the links to the pages you're trying to hide are coming from.
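If you would rather script it, the same idea can be sketched in a few lines of Python: crawl breadth-first and remember which page first linked to each URL, then walk that chain backwards to get a crawl path. This is only a rough sketch, not how Screaming Frog or the SEOmoz crawler works internally. The function names, the `get_links` callback, and the tiny in-memory `site` below are all made up for illustration; on a real site `get_links` would fetch the page and extract its links.

```python
from collections import deque


def crawl_with_referrers(start_url, get_links):
    """Breadth-first crawl that records which page first linked to each URL."""
    referrer = {start_url: None}  # the start page has no referrer
    queue = deque([start_url])
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            if link not in referrer:  # first time this URL is seen
                referrer[link] = page
                queue.append(link)
    return referrer


def crawl_path(url, referrer):
    """Follow the referrer chain back to the start page, returned in click order."""
    path = []
    while url is not None:
        path.append(url)
        url = referrer[url]
    return path[::-1]


# Hypothetical site: a promo page links to a size URL that should be hidden.
site = {
    "/": ["/category"],
    "/category": ["/product-small", "/promo"],
    "/promo": ["/product-large"],
    "/product-small": [],
    "/product-large": [],
}

referrer = crawl_with_referrers("/", lambda page: site.get(page, []))
print(" -> ".join(crawl_path("/product-large", referrer)))
```

Because the crawl is breadth-first, the path you get back is one of the shortest click paths to the unwanted URL. There may be other pages linking to it as well, which is why the full inlinks list is still worth checking.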
Also, have you checked the sitemap? The CMS might be listing these pages there.
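A quick way to check the sitemap is to pull out every `<loc>` entry and filter for the URLs you don't want indexed. The sketch below assumes a standard XML sitemap using the sitemaps.org namespace. The example URLs and the `is_unwanted` rule are hypothetical, so swap in whatever pattern identifies your size-specific URLs.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def unwanted_in_sitemap(xml_text, is_unwanted):
    """Return the sitemap URLs that match the caller's 'should be hidden' rule."""
    return [url for url in sitemap_urls(xml_text) if is_unwanted(url)]


# Hypothetical sitemap content; in practice, load your own sitemap file.
example_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/product?size=small</loc></url>
  <url><loc>https://www.example.com/product?size=large</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>
""".strip()


def is_unwanted(url):
    # Assumed rule: any size URL other than the smallest should not be listed.
    return "size=" in url and not url.endswith("size=small")


unwanted = unwanted_in_sitemap(example_sitemap, is_unwanted)
print(unwanted)
```

If this turns up any of the duplicate URLs, the sitemap is at least one of the places the crawler is finding them.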
Good luck, and let me know if you need any more help with this.
Mark