Moz Crawler suddenly reporting 1000s of duplicates (BE.net)
-
In the last 3-4 days we've had several thousand 'duplicate content' warnings appear in our crawl report, 99% of them related to our on-site blog. The blog is BlogEngine.Net, but the pages simply don't exist. The majority seem to be Roger trying quasi-random URLs like:
/?page=410/?page=151
Etc. etc. The blog will present content for these requests, but it is of course the same empty page since there's only unique content for up to /?Page=10 or so.
Two questions:
1. Did something change recently? These blogs have been up for months, and this problem has only come up this week. Did Roger change to become more aggressive lately?
2. Suggested remediation? On one of the blogs I've put no-index, no-follow on any page that has a /?page querystring, and we'll see what effect that has at next week's crawl. However, I'm not sure this will work as per:
http://moz.com/community/q/functionality-of-seomoz-crawl-page-reports
Anyone else had dynamic blogs suddenly blossom into thousands of duplicate content warnings? Google (rightly) ignores these pages completely.
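For anyone wanting to try the same remediation, the rule described in question 2 can be sketched roughly as below. This is only an illustration in Python, not BlogEngine.Net code; the function name `robots_meta_for` is hypothetical, and the case-insensitive match on the parameter name is an assumption based on both /?page= and /?Page= appearing above.

```python
from urllib.parse import urlsplit, parse_qs

# The tag emitted for any URL that carries a page querystring.
NOINDEX_TAG = '<meta name="robots" content="noindex, nofollow">'


def robots_meta_for(url: str) -> str:
    """Return the robots meta tag for a blog URL, or "" if none is needed.

    Hypothetical helper: any URL with a `page` query parameter is treated
    as a paginated listing and marked noindex, nofollow.
    """
    query = parse_qs(urlsplit(url).query)
    # Assumption: match the parameter name case-insensitively,
    # since both ?page= and ?Page= show up in the crawl.
    has_page_param = any(key.lower() == "page" for key in query)
    return NOINDEX_TAG if has_page_param else ""


if __name__ == "__main__":
    for candidate in ("/?page=410", "/?page=410/?page=151", "/post/hello"):
        print(candidate, "->", robots_meta_for(candidate) or "(no tag)")
```

Malformed URLs such as /?page=410/?page=151 still parse as having a `page` parameter, so they get the tag as well.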
-
Hate to bump my own question, but it appears I spoke too soon about no-index,no-follow solving this. The duplicate errors went away for about 5 days, but then yesterday spiked with the same problem. I've confirmed that no-index, no-follow are present on the pages being detected as bad.
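One way to double-check that the directives really are in the served HTML is a small parser like the sketch below. It uses only Python's standard `html.parser`; the names `RobotsMetaFinder` and `robots_directives` are made up for illustration, and fetching the page is left out on purpose.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag seen."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so default to "".
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the set of robots directives found in an HTML document."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(sorted(robots_directives(sample)))
```

This only confirms the tag is present in the markup; it says nothing about whether a given crawler honours it.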
As per the best practices document:
http://moz.com/learn/seo/robotstxt
Using meta robots no index no follow is the recommended option:
Block with Meta NoIndex
This tells engines they can visit, but are not allowed to display the URL in results. This is the recommended method
But it apparently isn't working, as evidenced by the new surge of duplicate errors. Is there anything else I can do? I don't want to explicitly block Roger in robots.txt as that seems rather backward. Should Roger be included in the Bad Robots List?
-
Peter -
Thanks for the clarification. I understand the philosophy at hand, and I kind of even understood it before I had asked the question. I'm handling these with a mix of canonical and no-index/no-follow.
Related to that, update:
By marking the superfluous pages no-index/no-follow the error count for the site has diminished by about 10,000 and the warning count by about 28,000 so that seems to be the way to go. The pages that had content are 'low value' in this context, since that content was readily available elsewhere.
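A rough sketch of what a "mix of canonical and no-index" could look like follows. The split used here (canonical for page numbers that exist, no-index/no-follow for everything else) is an assumption for illustration, not necessarily the exact rule applied on these blogs; `head_tag_for`, `last_real_page` and the `https://example.com` base are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

NOINDEX_TAG = '<meta name="robots" content="noindex, nofollow">'


def head_tag_for(url: str, last_real_page: int, base: str = "https://example.com") -> str:
    """Pick a canonical link or a noindex tag for a blog URL.

    Assumed policy: pages without a page parameter, and page numbers
    between 1 and last_real_page, get a canonical link; anything else
    (out of range or malformed) gets noindex, nofollow.
    """
    parts = urlsplit(url)
    query = {key.lower(): values for key, values in parse_qs(parts.query).items()}
    if "page" not in query:
        return f'<link rel="canonical" href="{base}{parts.path}">'
    raw = query["page"][0]
    if raw.isdecimal() and 1 <= int(raw) <= last_real_page:
        return f'<link rel="canonical" href="{base}{parts.path}?page={int(raw)}">'
    return NOINDEX_TAG


if __name__ == "__main__":
    for candidate in ("/?page=3", "/?page=410", "/?page=410/?page=151"):
        print(candidate, "->", head_tag_for(candidate, last_real_page=10))
```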
-
Hi there!
Thanks for writing in with a great question.
We definitely count those dynamic URLs as duplicate content. While we are pretty sure that search engines can figure this stuff out and know which URL to index, it's still considered best practice to canonicalize or otherwise direct crawlers to the original URL. That's as far as I know, though: I'm not a professional SEO, so you might be better off asking the Pro Q&A community at www.moz.com/community/q (they are all SEOs like you).
Since some dynamic URL generators can cause problems for crawlers, we do try to be overly-inclusive of these issues rather than overly-exclusive. We want people to know about potential issues with sites, even if they're not really issues in the scheme of the site owner's specific SEO implementation plan.
In sum, we'd rather leave those judgments up to you and, at the same time, provide you with the data you need to make these decisions. I hope this helps explain our thinking here! However, if you think that our crawler might be having issues, and you do not want to post your site URLs here, you could always send us a support ticket at help@moz.com. That way we can examine it a bit further and provide some insights into why our crawler thinks this way!
Hope this helps!
Peter
Moz Help Team.