Why doesn't Moz crawler follow robots.txt?
-
It is crawling the entire site, and there is stuff we do not want it to. Please advise.
-
Which I am ok with, but why am I getting duplicate content?
-
Yes. It doesn't tell crawlers which pages not to crawl, only which ones not to index.
-
It has been used correctly. The site is a Magento site, and Magento has it built in. There are a lot of product filters, so it uses rel=canonical to tell Google which URL to index.
-
rel=canonical is not a robots instruction. It is there to help with duplicate content: where you have the same or similar pages, you're telling search engines which page is the preferred one.
If you don't want pages crawled, you have to tell search engines in the robots.txt file.
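To see which preferred URL a page actually declares, you can read its rel=canonical tag directly. This is only a minimal sketch using Python's standard library; the sample HTML and the example.com URL are made up for illustration and are not taken from the site in question.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the list of canonical URLs declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical filtered product listing that points at its preferred URL.
SAMPLE_PAGE = """
<html><head>
<link rel="canonical" href="https://example.com/shoes">
</head><body>Shoes, filtered by colour</body></html>
"""

print(find_canonical(SAMPLE_PAGE))
```

A page that returns an empty list, or more than one URL, is worth checking by hand. Note that this tag says nothing about crawling: a crawler has to fetch the page before it can read the tag.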
-
Hi There,
Rel=canonical tags tell robots which page, out of many similar ones, should actually be indexed.
For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem.
https://moz.com/learn/seo/canonicalization
I feel you have not used it correctly. Check the above article and see if it helps.
Thanks,
Vijay
-
So I made a mistake: it isn't the robots.txt that is the issue. I am getting hit with a ton of duplicate content penalties, so I figured that was it. The problem is that I have pages with rel=canonical tags that it is ignoring. Does Roger not read those?
-
Hi
Have to agree with the above. Rogerbot does obey the robots.txt file, unlike Bing. While Bing is getting better, it still ignores the robots.txt file frequently.
I've analysed quite a few server logs over the years and Roger has always obeyed the file. The cause is usually a mistake in the robots.txt file.
There is an option to test your robots.txt file in Google Search Console (GSC). That test checks whether Google will crawl the page, but Roger usually follows the same instructions as Google.
However, if you are still pretty certain that Roger is ignoring robots.txt, please DM me your server logs and your website and I will take a look and analyse them for you (free, of course).
Thanks
Andy
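If you want to check your own server logs first, something like the following could be a starting point. It is a rough sketch that assumes your access log uses the common "combined" format and that the crawler's user-agent string contains "rogerbot"; the log lines, IP addresses and user-agent strings below are invented for the example.

```python
import re
from urllib.robotparser import RobotFileParser

# Matches the request, status, size, referer and user-agent parts of a
# combined-format access log line (assumed format, adjust for your server).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def disallowed_hits(log_lines, robots_txt, bot="rogerbot"):
    """Return paths the bot requested even though robots.txt disallows them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match or bot not in match["agent"].lower():
            continue  # unparseable line, or a different client
        if not parser.can_fetch(bot, match["path"]):
            hits.append(match["path"])
    return hits


ROBOTS_TXT = """\
User-agent: Rogerbot
Disallow: /tmp/
"""

# Hypothetical log lines; the user-agent strings are placeholders.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /tmp/cache.html HTTP/1.1" 200 512 "-" "rogerbot/1.0 (placeholder)"',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /products/shoes HTTP/1.1" 200 2048 "-" "rogerbot/1.0 (placeholder)"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /tmp/cache.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

print(disallowed_hits(SAMPLE_LOG, ROBOTS_TXT))
```

An empty result suggests the crawler is obeying the file as Python's parser reads it. Any paths that do show up are the ones to compare against your robots.txt rules.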
-
All major search engines, along with Moz's crawler Rogerbot and the Internet Archive, respect robots.txt, the standard “robots exclusion protocol” used to communicate with web crawlers and other web robots.
If you wish to exclude specific sections from all search engines, you can use the following sample code as a reference for blocking specific directories.
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
However, if you want to block only Moz's Rogerbot from crawling specific sections of your website, you can use the following reference code to block specific areas or directories:
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
I hope this helps. If you have specific questions, please feel free to respond; I will be happy to answer them.
Regards,
Vijay
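One way to sanity-check rules like these before deploying them is Python's built-in robots.txt parser. The sketch below feeds it the Rogerbot block from above and asks whether a few URLs may be fetched. The example.com URLs and the "SomeOtherBot" name are placeholders, and Python's parser is only an approximation of how any particular crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The Rogerbot-specific rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URLs for illustration only.
for agent, url in [
    ("rogerbot", "https://example.com/tmp/cache.html"),
    ("rogerbot", "https://example.com/products/shoes"),
    ("SomeOtherBot", "https://example.com/tmp/cache.html"),
]:
    print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Because the file only has a Rogerbot group, other crawlers are not restricted by it. Adding the `User-agent: *` block as well would cover them.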
-
Hi there! Moz's crawler, rogerbot, does follow robots.txt. When he's not following robots.txt, it's usually because the robots.txt file is formatted improperly. Learn more about formatting the file here: https://moz.com/learn/seo/robotstxt
For more information on Roger, including how to block him, head here: https://moz.com/help/guides/moz-procedures/what-is-rogerbot
And if you want to test your formatting, try the Robots Checker here: https://support.google.com/webmasters/answer/6062598
If you're still unable to determine why rogerbot is crawling your site, feel free to write in to help@moz.com!