Why doesn't Moz crawler follow robots.txt?
-
It is crawling the entire site, and there is stuff we do not want it to. Please advise.
-
Which I am ok with, but why am I getting duplicate content?
-
Yes, it doesn't tell them which pages not to crawl - just not to index them
-
It has been used correctly. The site is a Magento site and they have it built in. There are a lot of filters for products so it uses rel=canonical to tell Google which to index.
-
rel=canonical is not really a robots instruction - rel=canonical is there to help with duplicate copy, where you have the same or similar pages and you're telling search engines which page is the preferred one.
If you don't want pages crawled, you have to tell search engines in the robots.txt file.
-
Hi There,
Rel=canonical tags tell robots which page, out of many, should actually be indexed.
For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem.
https://moz.com/learn/seo/canonicalization
I feel you have not used it correctly. Check the above article and see if it helps.
Thanks,
Vijay
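To check what a page actually declares as its canonical, a quick script can help. This is only a minimal sketch using Python's standard `html.parser`; the sample page and the `example.com` URL are made up for illustration, and the helper names (`CanonicalFinder`, `find_canonical`) are hypothetical, not part of any Moz or Magento tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical filtered product listing that points at its unfiltered version.
sample_page = """
<html><head>
<title>Shoes, filtered by colour</title>
<link rel="canonical" href="https://www.example.com/shoes" />
</head><body></body></html>
"""

print(find_canonical(sample_page))
```

Running this over a few of the filtered URLs and comparing the results would show whether they all point at the same preferred page, which is the point of the tag.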
-
So I made a mistake: it isn't the robots.txt that is the issue. I am getting hit with a ton of duplicate content penalties, so I figured that was it. The problem is that I have pages with rel=canonical tags that it is ignoring. Does Roger not read those?
-
Hi
Have to agree with the above: Rogerbot does listen to the robots.txt file, unlike Bing - while they are getting better, Bing ignores the robots.txt file frequently.
I've analysed quite a few server logs over the years and Roger has always listened to the file - it's usually a mistake in the robots file.
There is an option to test your robots.txt file in GSC (Google Search Console) - while this tests whether Google will crawl the page, Roger usually follows the same instructions as Google.
However if you are still pretty certain that Roger is ignoring robots.txt please DM your Server Logs and your website and I will take a look and analyse it for you (free of course).
Thanks
Andy
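For anyone who wants to try the server log check themselves first, here is a rough sketch in Python. It assumes the access log is in the common "combined" format and that the crawler's user agent string contains `rogerbot`; the log lines, IP addresses, and the shortened `rogerbot/1.0` user agent below are invented sample data, and `disallowed_hits` is a hypothetical helper. It relies on Python's own `urllib.robotparser`, which may not interpret every rule exactly the way Moz's crawler does.

```python
import re
from urllib.robotparser import RobotFileParser

# Pulls the request path and the user agent out of a combined-format log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def disallowed_hits(log_lines, robots_txt, agent_token="rogerbot"):
    """Return paths the crawler requested although robots.txt disallows them."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    hits = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a request line we recognise
        if agent_token not in match.group("agent").lower():
            continue  # some other visitor
        if not rules.can_fetch(agent_token, match.group("path")):
            hits.append(match.group("path"))
    return hits


# Illustrative rules and log lines (assumed data, not from a real site).
SAMPLE_ROBOTS = """\
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

SAMPLE_LOG = [
    '203.0.113.5 - - [22/Jun/2020:10:00:00 +0000] "GET /tmp/cache.html HTTP/1.1" 200 512 "-" "rogerbot/1.0"',
    '203.0.113.5 - - [22/Jun/2020:10:00:01 +0000] "GET /products/shoes HTTP/1.1" 200 2048 "-" "rogerbot/1.0"',
    '198.51.100.7 - - [22/Jun/2020:10:00:02 +0000] "GET /tmp/cache.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

print(disallowed_hits(SAMPLE_LOG, SAMPLE_ROBOTS))
```

An empty result suggests the crawler stayed within the rules as Python reads them. Any path that does show up is worth checking against the robots.txt formatting before concluding the file is being ignored.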
-
All major search engine crawlers, as well as Moz's crawler Rogerbot and the Internet Archive, respect robots.txt, the standard "robots exclusion protocol" for communicating with web crawlers and web robots.
In case you wish to exclude some specific information from all search engines, you can use the following sample code as a reference to block specific directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
However, if you want to block only Moz's Rogerbot from crawling specific sections of your website, you can use the following reference code to block those areas / directories from rogerbot:
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
I hope this helps. If you have specific questions, please feel free to respond; I will be happy to answer them.
Regards,
Vijay
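The difference between the `User-agent: *` group and a `User-agent: Rogerbot` group can be checked locally before deploying the file. The sketch below uses Python's standard `urllib.robotparser`. The combined rules, the `SomeOtherBot` name, and the paths are assumptions chosen for illustration, and `is_allowed` is a hypothetical helper. Python's parser is only one interpretation of the protocol, so treat the result as a sanity check rather than proof of how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Assumed example: every crawler is kept out of /cgi-bin/,
# while Rogerbot is additionally kept out of /tmp/ and /junk/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if the rules permit user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


for agent in ("rogerbot", "SomeOtherBot"):
    for path in ("/tmp/report.html", "/cgi-bin/form", "/products/"):
        print(agent, path, is_allowed(ROBOTS_TXT, agent, path))
```

A crawler with its own named group follows that group instead of the `*` group, which is why the Rogerbot section repeats the `/cgi-bin/` line rather than relying on the general rule.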
-
Hi there! Moz's crawler, rogerbot, does follow robots.txt. When he's not following robots.txt, it's usually because the robots.txt file is formatted improperly. Learn more about formatting your file here: https://moz.com/learn/seo/robotstxt
For more information on Roger, including how to block him, head here: https://moz.com/help/guides/moz-procedures/what-is-rogerbot
And if you want to test your formatting, try the Robots Checker here: https://support.google.com/webmasters/answer/6062598
If you're still unable to determine why rogerbot is crawling your site, feel free to write in to help@moz.com!