A question about Mozbot and a recent crawl on our website.
-
Hi All,
Rogerbot has been reporting errors on our websites for over a year now, and we correct the issues as soon as they are reported.
However, I have two questions regarding the recent crawl report we got on the 8th.
1.) Pages with a "no-index" tag are being crawled by Roger and reported as duplicate page content errors. I can ignore these as Google doesn't see these pages, but surely Roger should ignore pages with "no-index" instructions as well? Also, these errors won't go away in our campaign until Roger ignores the URLs.
2.) What bugs me most is that resource pages that have been around for about 6 months have only just been reported as duplicate content. Our weekly crawls have never picked up these resource pages as a problem, so why now all of a sudden? (It makes me wonder how extensive each crawl is.)
Anyone else had a similar problem?
Regards
GREG
-
It's pretty big.
Over 1,000 pages in the index, and many more internal URLs to crawl that have a no-index tag (booking forms, etc.).
I'll see if we can archive our other campaigns and let Roger crawl our main site properly.
-
How big is your website, Greg?
-
Thanks Nakul,
I do a weekly scan with Xenu which doesn't have a URL limit like SF.
I was under the impression that a full scan of the site was done each week, but as you say, it's being scanned in chunks, divided across our 3 other websites.
If this is the case, it would be great to be able to tell Mozbot where to crawl, so that it doesn't use up resources on unnecessary URLs when it could be scanning our most important pages.
Greg
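
One way to steer a crawler away from low-value URLs is a robots.txt group aimed at its user agent. The sketch below assumes the Moz crawler honours a `User-agent: rogerbot` group; the `/booking/` and `/enquiry/` paths and the example.com URLs are placeholders, not taken from the site discussed here. It uses Python's standard `urllib.robotparser` to check which URLs such a file would block before it is deployed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the disallowed paths are placeholders for
# no-index sections such as booking forms.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /booking/
Disallow: /enquiry/

User-agent: *
Disallow:
"""


def allowed_for(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.example.com/booking/form",
        "https://www.example.com/resources/guide",
    ):
        print(url, allowed_for("rogerbot", url))
```

Because the rules sit in a group specific to rogerbot, other crawlers fall through to the `*` group and are unaffected.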
-
Greg,
The crawl is limited to a total of 10,000 pages across all 5 of your campaigns.
As for whether Rogerbot should ignore noindex, here's what I think: the intent of the tool is to find issues. In this scenario, Rogerbot is making sure you are aware that some of those pages have a noindex tag. Roger does not know whether that is intentional or not.
You can also do a deeper crawl of your website using Screaming Frog SEO Spider: http://www.screamingfrog.co.uk/seo-spider/ It does a great job of crawling deeply when you want it to, since it's desktop software and you can set all sorts of options to identify issues.
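
If the noindex pages are intentional, the duplicate-content list can be filtered locally so that only indexable pages are left to review. This is a minimal sketch using only the Python standard library. It assumes you already have the HTML of each reported URL (for example from your own crawl); the function names and sample URLs are hypothetical, and it only looks at the robots meta tag, not at `X-Robots-Tag` HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page has <meta name="robots"> containing noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def drop_noindexed(pages):
    """Given {url: html}, return the URLs that are not marked noindex."""
    return [url for url, html in pages.items() if not has_noindex(html)]


if __name__ == "__main__":
    sample = {
        "https://www.example.com/booking": '<meta name="robots" content="noindex, follow">',
        "https://www.example.com/resources": "<title>Resources</title>",
    }
    print(drop_noindexed(sample))
```

The URLs that remain after filtering are the ones where a duplicate-content warning actually matters for search engines.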