Duplicate Content Report: Duplicate URLs being crawled with "++" at the end
-
Hi,
In our Moz report over the past few weeks I've noticed some duplicate URLs appearing like the following:
Original (valid) URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green
Duplicate URL:
http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green++
These aren't appearing in Webmaster Tools or in a Screaming Frog crawl of our site, so I'm wondering if this is a bug with the Moz crawler. I realise it could be resolved with a canonical reference, or by 301 redirecting the duplicate to the canonical URL, but I'd like to find out what's causing it and whether anyone else is experiencing the same problem.
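For anyone else hitting this, the fix I have in mind is a 301 that strips the trailing plus signs before the page is served. Here's a rough sketch of the normalisation (the helper below is just my own illustration, not anything from our actual codebase):

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Strip trailing '+' characters from the query string, so that
    ...?filter_colour=Green++ maps to ...?filter_colour=Green."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       parts.query.rstrip('+'), parts.fragment))

dupe = "http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green++"
print(canonical_url(dupe))
# http://www.paperstone.co.uk/cat_553-616_Office-Pins-Clips-and-Bands.aspx?filter_colour=Green
```

The same rule could sit behind a server-side rewrite that issues the 301, or be used to generate the rel=canonical value.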
Thanks,
George
-
So glad to help, George!
-
Hi Chiaryn,
Thanks - you've been really helpful! I had assumed that as the referrer wasn't in the Web UI (per WMT), it wasn't available anywhere. I'd also assumed it was a copywriting issue and not a product data issue.
Need to re-address my assumptions.
George
-
Hey George,
Thanks for writing in.
I looked into the pages with the ++ in the URL, and it seems they do actually exist on the site, so this isn't an issue with our crawler causing the entries in your crawl errors. For example, a link to the URL http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx?filter_colour=Green++ can be found in the source code of the page http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx here: http://screencast.com/t/HpHTlSs5gH8H
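If anyone wants to double-check that for themselves, here's a quick sketch (standard-library Python; the regex is only a rough filter for this particular pattern) that pulls the page and lists any href values ending in ++:

```python
import re
import urllib.request

page = "http://www.paperstone.co.uk/cat_553_Desktop-Essentials.aspx"
html = urllib.request.urlopen(page).read().decode("utf-8", errors="replace")

# List every href value in the source that ends with '++'.
for href in re.findall(r'href="([^"]+\+\+)"', html):
    print(href)
```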
You can find the referring pages for the ++ URLs by downloading the Full Crawl Diagnostics CSV. In the first column, search for ++. When you find the correct row, look in the column labelled Referrer (column AM in the spreadsheet). This tells you the URL of the page where our crawlers first found the links that include ++. You can then visit that URL to find the links to those pages.
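If searching the spreadsheet by hand is fiddly, the same lookup only takes a few lines. This is a sketch that assumes you've saved the export as crawl_diagnostics.csv and that the headers are literally "URL" and "Referrer"; adjust those names to whatever the CSV actually uses:

```python
import csv

with open("crawl_diagnostics.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        if "++" in row["URL"]:                        # first column: the crawled URL
            print(row["URL"], "<-", row["Referrer"])  # the referrer column (AM in the spreadsheet)
```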
Since these URLs with the ++ resolve with a 200 HTTP status and serve the same code and content as the pages without the ++, our crawler counts them as duplicate content. I'm not certain why Screaming Frog and GWT aren't finding or reporting these pages; it may be that they parse the + signs in the URL differently than our crawler does.
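On the parsing point: '+' is special inside a query string. Some tools decode it as a space while others leave it alone, which could explain why the same link is reported by one crawler and not another. A quick illustration:

```python
from urllib.parse import parse_qs, unquote, unquote_plus

query = "filter_colour=Green++"
print(unquote(query))        # filter_colour=Green++   ('+' left as-is)
print(unquote_plus(query))   # filter_colour=Green     (trailing '+' decoded to spaces)
print(parse_qs(query))       # {'filter_colour': ['Green  ']}
```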
As Keri and bishop23 mentioned, this is most likely not a major issue if GWT isn't reporting the errors, but we prefer to report the issues because we would rather be safe than sorry.
I hope this helps. Please let me know if you have any other questions.
Chiaryn
-
I'm not seeing an answer that jumps out at me for this one. For the immediate future, don't sweat it if you're not seeing it in GWT. This is assigned to our help desk, and we'll have someone there investigate further and get back to you, though it might take a few days because of the Thanksgiving holiday (if you don't get an answer today, it may be Monday before we have a chance to respond).
-
If they're not appearing in WMT then you should ignore them, unless it's exact duplicate content, in which case delete it.