Duplicate Content Report: Duplicate URLs being crawled with "++" at the end
-
Hi,
In our Moz report over the past few weeks I've noticed some duplicate URLs appearing like the following:
Original (valid) URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green
Duplicate URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green++
These aren't appearing in Webmaster Tools or in a Screaming Frog crawl of our site, so I'm wondering if this is a bug with the Moz crawler. I realise that it could be resolved using a canonical reference, or by performing a 301 from the duplicate to the canonical URL, but I'd like to find out what's causing it and whether anyone else is experiencing the same problem.
Thanks,
George
-
So glad to help, George!
-
Hi Chiaryn,
Thanks - you've been really helpful! I had assumed that as the referrer wasn't in the Web UI (per WMT), it wasn't available anywhere. I'd also assumed it was a copywriting issue and not a product data issue.
I need to readdress my assumptions.
George
-
Hey George,
Thanks for writing in.
I looked into the pages with the ++ in the URL and it seems that they do actually exist on the site, so it isn't an issue with our crawler that is causing these in your crawl errors. For example, a link to the URL http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green++ can be found in the source code of the page http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx here: http://screencast.com/t/HpHTlSs5gH8H
You can find the referral pages for the ++ pages on the site by downloading the Full Crawl Diagnostics CSV. In the first column, search for ++. When you find the correct row, look in column AM, labeled "referrer". This gives the URL of the page where our crawler first found the URL that includes ++. You can then visit that page to find the links to the ++ URLs.
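If the export is large, the same lookup can be scripted rather than done by hand. The following is a minimal sketch, not the official Moz workflow: the column headers `URL` and `Referrer` are assumptions (check the header row of your own export and adjust), and the sample rows are made up for illustration, with only the first URL pair taken from this thread.

```python
import csv
import io


def find_referrers(csv_file, needle="++", url_column="URL", referrer_column="Referrer"):
    """Return (url, referrer) pairs for rows whose URL contains `needle`.

    `url_column` and `referrer_column` are assumed header names, not
    confirmed Moz export headers; pass the real ones from your CSV.
    """
    reader = csv.DictReader(csv_file)
    matches = []
    for row in reader:
        url = row.get(url_column) or ""
        if needle in url:
            matches.append((url, row.get(referrer_column) or ""))
    return matches


# Small in-memory sample standing in for the downloaded export (illustrative rows).
sample = io.StringIO(
    "URL,Referrer\n"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green++,"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx\n"
    "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green,"
    "http://www.paperstone.co.uk/\n"
)

for url, referrer in find_referrers(sample):
    print(url, "<-", referrer)
```

For a real export, open the file with `open(path, newline="", encoding="utf-8")` and pass the file object to `find_referrers` in place of `sample`.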
Since these URLs with the ++ resolve with a 200 HTTP status and have the same code and content as the pages without the ++, our crawler counts them as duplicate content. I'm not certain why Screaming Frog and GWT are not finding or reporting these pages; it may be that they parse the + signs in the URL differently than our crawler does.
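As one illustration of how + signs can be read differently: under form-style query decoding, a + stands for a space, so `Green++` becomes `Green` followed by two spaces. A tool or server that then trims whitespace would treat both URLs as the same filter, while a tool comparing raw URL strings would see two distinct pages. The sketch below uses Python's standard `urllib.parse`; the helper name `colour_filter` is hypothetical, and whether any of the crawlers or the site itself actually behave this way is an assumption, not something confirmed in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def colour_filter(url):
    """Return the decoded filter_colour value from a URL's query string."""
    query = urlsplit(url).query
    # parse_qs applies form-style decoding, where '+' is read as a space.
    values = parse_qs(query).get("filter_colour", [""])
    return values[0]


base = "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour="
plain = colour_filter(base + "Green")
plus = colour_filter(base + "Green++")

print(repr(plain))            # 'Green'
print(repr(plus))             # 'Green  ' (two trailing spaces)
print(plain == plus.strip())  # True: identical once whitespace is trimmed
```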
As Keri and bishop23 mentioned, this is most likely not a major issue if GWT isn't reporting the errors, but we prefer to report the issues because we would rather be safe than sorry.
I hope this helps. Please let me know if you have any other questions.
Chiaryn
-
I'm not seeing an answer that jumps out at me for this one. For the immediate future, don't sweat it if you're not seeing it in GWT. This is assigned to our help desk, and we'll have someone from there investigate more and get back to you, though it might be a few days because of the Thanksgiving holiday (if you don't get an answer today, it may be Monday before we have a chance to respond).
-
If they're not appearing in WMT, then you should ignore them, unless it's exact duplicate content, in which case delete it.