Rogerbot's crawl behaviour vs google spiders and other crawlers - disparate results have me confused.
-
I'm curious as to how accurately rogerbot replicates google's searchbot
I've currently got a site which is reporting over 200 pages of duplicate/titles content in moz tools. The pages in question are all session IDs and have been blocked in the robot.txt (about 3 weeks ago), however the errors are still appearing.
I've also crawled the page using screaming frog SEO spider. According to Screaming Frog, the offending pages have been blocked and are not being crawled. Webmaster tools is also reporting no crawl errors.
Is there something I'm missing here? Why would I receive such different results. Which one's should I trust? Does rogerbot ignore robot.txt? Any suggestions would be appreciated.
-
Thanks for your response. I was beginning to think this question had been left to rot.
I'm not getting any errors in WMT. What is concerning is that Roger is returning almost 300 errors of dupe content, which is obviously a problem. Screaming frog is no longer finding the pages (they've been blocked in the robot.txt) I guess what I'm trying to ask here is how can I be sure that my dupe content has been effectively blocked from google's spider.
Is there anyway to check?
Thanks for your help.
-
I've see similar concerns from others, it seems "rogerbot" does ignore certain things that other bots consider.
Don't worry about it, if it's not being flagged in WMT it shouldn't be an issue.
Take Roger as a guide rather than an iron fist bot like googlebot.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My website was at the top of Google search for some years... suddenly I almost can't reach first page! Moz ranks my website better than the competitors... what might be going one? Could anybody help me out? Thanks!
Hello Guys! My website was at the top of Google search for some years... suddenly I almost can't reach first page! Moz ranks my website better than the competitors... what might be going one? Could anybody help me out? Moz rank us grade A... the competitors B or C .. I think we have better back links than they do... Would you need any kind of data or report to help me here? Thanks!
Moz Pro | | wesleyms0 -
Why doesn't Moz crawl whole pages of our website to report All On-Page issues?
Hi friends & mozzers, How can't Moz crawl whole pages of our website: https://www.4atvtires.com/ to report All Serious On-Page issues. We have more than 15000 product pages. And how could it be possible that Moz isn't able to crawl whole, just got crawl report upto 258 pages of our website, and also I can experience the same in Google webmaster ?? Please help to fix this issue as early as possible. Regards,
Moz Pro | | BigSlate
Rann0 -
How do i fix the problem of having 2 url's splitting my rankings?
please excuse my noobness. i have a nice site, www.soundsenglish.com which I built from scratch and learned by doing. It has lots of nice content and it does ok, my rankings are woeful mostly cos of all the mistakes i made building it...i'll fix that stuff. This stuff i don't know about. from my adsense i get 2 listings www.soundsenglish.com and soundsenglish.com wierdly the second one gets consistently higher paying ads although most of the visitors come through the first but they are both the same landing page same content -as far as i can tell. when i try to find rankings, use the seo tools etc i get diferent scores, so whatever it is, it is splitting the sites - can't be a good thing. i have no idea why this happens and i have some inkling that maybe i need something to do with cannonical redirects or maybe a 301 redirect. both of which i have little idea how to do. If that isn't enough naive blundering about for you, i have a little more... it occurs to me that this prpoblem is probably happening with every page on my site, i.e. the 'juice ' is not getting credited onto that one page. this surely means cannonical redirects but even afterreading up on them idon't quite get it. or rather ido but idon;t get how to apply it to my context.
Moz Pro | | soundsenglish0 -
Integration of Google+, LinkedIn, YouTube?
I ran across a blog on Moz from last November that was introducing the social monitoring feature. It mentioned adding new social platforms that would eventually be included in SEOMoz. Any word on when/if that is happening? Excited to have SEO and Social under the same dashboard! Jake
Moz Pro | | AESEO2 -
Garbled URL's in Private Messages.
Every time I try to put a url in a Private message it gets garbled up with extra chars and then won't go to the right place. http://www.facebook.com/pages/Mariah-Carle-Photography Becomes: http://www.facebook.com/pages/Mariah58973jhsdfui-Carle%8594743Photography Ok after that test I deliberately garbled a url and it STILL work in the open forums....
Moz Pro | | Mcarle0 -
Crawl Errors Confusing Me
The SEOMoz crawl tool is telling me that I have a slew of crawl errors on the blog of one domain. All are related to the MSNbot. And related to trackbacks (which we do want to block, right?) and attachments (makes sense to block those, too) ... any idea why these are crawl issues with MSNbot and not Google? My robots.txt is here: http://www.wevegotthekeys.com/robots.txt. Thanks, MJ
Moz Pro | | mjtaylor0 -
Filtering OSE Results
Issue: When I export OSE Linking Pages results to .CSV, I'd like to filter only unique domains. It seems to me that there should be a way to do this with Excel, perhaps pivot table. Anyone have a quick solution for this? Alternatively, I can use Linking Domains, which gives all unique roots, but then i lose follow/nofollow filter. Thoughts?
Moz Pro | | Gyi0 -
Initial Crawl Questions
Hello. I just joined and used the Crawl tool. I have many questions and hoping the community can offer some guidance. 1. I received an Excel file with 3k+ records. Is there a friendly online viewer for the Crawl report? Or is the Excel file the only output? 2. Assuming the Excel file is the only output, the Time Crawled is a number (i.e. 1305798581). I have tried changing the field to a date/time format but that did not work. How can I view the field as a normal date/time such as May 15, 2011 14:02? 3. I use the ™ symbol in my Title. This symbol appears in the output as a few ascii characters. Is that a concern? Should I remove the trademark symbol from my Title? 4. I am using XenForo forum software. All forum threads automatically receive a Title Tag and Meta Description as part of a template. The Crawl Test report shows my Title Tag and Meta Description as blank for many threads. I have looked at the source code of several pages and they all have clean Title tags and I don't understand why the Crawl Report doesn't show them. Any ideas? 5. In some cases the HTTP Status Code field shows a result of "3". Why does that mean? 6. For every URL in the Crawl Report there is an entry in the Referrer field. What exactly is the relationship between these fields? I thought the Crawl Tool would inspect every page on the site. If a page doesn't have a referring page is it missed? What if a page has multiple referring pages? How is that information displayed? 7. Under Google Webmaster Tools > Site Configurations > Settings > Parameter Handling I have the options set as either "Ignore" or "Let Google Decide" for various URL parameters. These are "pages" of my site which should mostly be ignored. For example a forum may have 7 headers, each on of which can be sorted in ascending or descending order. The only page that matters is the initial page. All the rest should be ignored by Google and the Crawl. Presently there are 11 records for many pages which really should only have one record due to these various sort parameters. Can I configure the crawl so it ignores parameter pages? I am anxious to get started on my site. I dove into the crawl results and it's just too messy in it's present state for me to pull out any actionable data. Any guidance would be appreciated.
Moz Pro | | RyanKent0