Blocking pages from Moz and Alexa robots
-
Hello,
We want to block all pages in this directory from Moz and Alexa robots - /slabinventory/search/
Here is an example page - https://www.msisurfaces.com/slabinventory/search/granite/giallo-fiesta/los-angeles-slabs/msi/
Let me know if this is a valid disallow for what I'm trying to.
User-agent: ia_archiver
Disallow: /slabinventory/search/*User-agent: rogerbot
Disallow: /slabinventory/search/*Thanks.
-
Hi,
Firstly, yes, that robots.txt is valid and would work for your purpose.
There's a great tool (https://technicalseo.com/tools/robots-txt/) that allows you to put in your proposed robots.txt file contents, the URL you want to test and even the robot you want to test against and it lets you know the result.
-
That looks valid to me. It's possible you may not need "*" at the end of each rule but I can't see it doing any harm either
I might go more like:
User-agent: ia_archiver
Disallow: /*/search/User-agent: rogerbot
Disallow: /*/search/^ this would stop all search URLs being indexed, so even if you introduced new search facilities later in other directories - they would 'probably' be caught too (assuming that is your intention, assuming they were still in /search/ subdirs)
Don't think what you have done is wrong though.
Always check using Google's robots.txt tester to be safe. Just put your rules into the tester (altering them to be used for all user-agents), and try out some different URL patterns. When it works as you like, update your real robots.txt file (remembering of course, to restore your rogerbot / alexa UA targeting - if you don't want the rules to also apply to Google!)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Over 40+ pages have been removed from the indexed and this page has been selected as the google preferred canonical.
Over 40+ pages have been removed from the indexed and this page has been selected as the google preferred canonical. https://studyplaces.com/about-us/ The pages affected by this include: https://studyplaces.com/50-best-college-party-songs-of-all-time-and-why-we-love-them/ https://studyplaces.com/15-best-minors-for-business-majors/ As you can see the content on these pages is totally unrelated to the content on the about-us page. Any ideas why this is happening and how to resolve.
Technical SEO | | pnoddy0 -
Adding directories to robots nofollow cause pages to have Blocked Resources
In order to eliminate duplicate/missing title tag errors for a directory (and sub-directories) under www that contain our third-party chat scripts, I added the parent directory to the robots disallow list. We are now receiving a blocked resource error (in Webmaster Tools) on all of the pages that have a link to a javascript (for live chat) in the parent directory. My host is suggesting that the warning is only a notice and we can leave things as is without worrying about the page being de-ranked/penalized. I am wondering if this is true or if we should remove the one directory that contains the js from the robots file and find another way to resolve the duplicate title tags?
Technical SEO | | miamiman1000 -
Block Domain in robots.txt
Hi. We had some URLs that were indexed in Google from a www1-subdomain. We have now disabled the URLs (returning a 404 - for other reasons we cannot do a redirect from www1 to www) and blocked via robots.txt. But the amount of indexed pages keeps increasing (for 2 weeks now). Unfortunately, I cannot install Webmaster Tools for this subdomain to tell Google to back off... Any ideas why this could be and whether it's normal? I can send you more domain infos by personal message if you want to have a look at it.
Technical SEO | | zeepartner0 -
Alexa Ranking - Improvement
Hey Guys, I have very decent traffic for my website but the Alexa rank is fluctuating very frequently.If the traffic is growing I cant see any changes and if the traffic dips down a bit then my rankings are going high 😞 Please suggest me what to do for getting an Alexa rank of **7000. **At present I have 9859 Hers ie my website link : http://www.teluguone.com/ Thanks in Advance Saikiran
Technical SEO | | logobite0 -
Handling 301s: Multiple pages to a single page (consolidation)
Been scouring the interwebs and haven't found much information on redirecting two serparate pages to a single new page. Here is what it boils down to: Let's say a website has two pages, both with good page authority of products that are becoming fazed out. The products, Widget A and Widget B, are still popular search terms, but they are being combined into ONE product, Widget C. While Widget A and Widget B STILL have plenty to do with Widget C, Widget C is now the new page, the main focus page, and the page you want everyone to see and Google to recognize. Now, do I 301 Widget A and Widget B pages to Widget C, ALTHOUGH Widgets A and B previously had nothing to do with one another? (Remember, we want to try and keep some of that authority the two page have had.) OR do we keep Widget A and Widget B pages "alive", take them off the main navigation, and then put a "disclaimer" on the pages announcing they are now part of Widget C and link to Widget C? OR Should Widgets A and B page be canonicalized to Widget C? Again, keep in mind, widgets A and B previously were not similar, but NOW they are and result in Widget C. (If you are confused, we can provide a REAL work example of what we are talkinga about, but decided to not be specific to our industry for this.) Appreciate any and all thoughts on this.
Technical SEO | | JU19850 -
Page Over-optimized?
I read over this post on the blog tonight: http://www.seomoz.org/blog/lessons-learned-by-an-over-optimizer-14730 & it's got me concerned that I might be having a similar issue on our site? Back in March & April of last year, we ranked fairly well for a number of long tail keywords, here is one in particular 'Mio Drink' for this page: http://www.discountqueens.com/free-mio-drink-from-kraft-facebook-offer The page is still indexed, but appears back on page #3 for the search term. During this time we had made a number of different updates to our site & I can't seem to put an exact finger on what might have caused the problem? Can anyone see any issues that might have caused this to drop? Thanks, BJ
Technical SEO | | seointern0 -
Should search pages be disallowed in robots.txt?
The SEOmoz crawler picks up "search" pages on a site as having duplicate page titles, which of course they do. Does that mean I should put a "Disallow: /search" tag in my robots.txt? When I put the URL's into Google, they aren't coming up in any SERPS, so I would assume everything's ok. I try to abide by the SEOmoz crawl errors as much as possible, that's why I'm asking. Any thoughts would be helpful. Thanks!
Technical SEO | | MichaelWeisbaum0 -
Robots.txt Syntax
Does the order of the robots.txt syntax matter in SEO? For example (are there potential problems with this format): User-agent: * Sitemap: Disallow: /form.htm Allow: / Disallow: /cgnet_directory
Technical SEO | | RodrigoStockebrand0