How do I disallow crawl on a directory when it's a prefix to my site's URL?
-
I am trying to stop robots from crawling our media repository. It is hosted elsewhere but appears as part of our site. The catch is that it is not a subdirectory of the site; it's a prefix on the URL (a subdomain).
So I need to disallow: mediabank.mywebsite.org
Not: mywebsite.org/mediabank
What would I need to put in my robots.txt and/or the other host's robots.txt to make this happen?
Thanks!
-
Hey there! Tawny from Moz's Help Team here.
Robots.txt rules apply per hostname, so you'll want to add a separate robots.txt file for that subdomain, and then add a Disallow directive to that file. Using your example, you'd want a file at mediabank.mywebsite.org/robots.txt with a Disallow directive for any robots you don't want crawling that subdomain.
For all user-agents, that would look something like this:
User-agent: *
Disallow: /
That would stop any user-agents from crawling any pages on that subdomain.
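If you'd like to sanity-check the rules before publishing them, here is a minimal sketch using Python's standard-library urllib.robotparser. It only illustrates how a well-behaved crawler would read the file, not how any particular crawler is implemented. The https scheme, the ExampleBot user-agent, the sample page URLs, and the "block nothing" file for the main site are all assumptions made up for the example.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Rules from the answer above, to be served at
# mediabank.mywebsite.org/robots.txt (hostname taken from the question).
SUBDOMAIN_RULES = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt for the main site that blocks nothing
# (an empty Disallow value means "allow everything").
MAIN_SITE_RULES = """\
User-agent: *
Disallow:
"""


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (one file per host)."""
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


def is_allowed(robots_txt: str, user_agent: str, page_url: str) -> bool:
    """Check page_url against the given robots.txt text for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


if __name__ == "__main__":
    # Sample URLs and the ExampleBot user-agent are made up for illustration.
    media = "https://mediabank.mywebsite.org/images/logo.png"
    main = "https://mywebsite.org/about"
    print(robots_txt_url(media))
    print(is_allowed(SUBDOMAIN_RULES, "ExampleBot", media))
    print(is_allowed(MAIN_SITE_RULES, "ExampleBot", main))
```

Run as a script, this should print the subdomain's own robots.txt address, then False for the media URL and True for the main-site URL. The takeaway is that the file on the main site has no effect on the subdomain, and the file on the subdomain has no effect on the main site.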
I hope this helps! If you've still got questions, feel free to send us a note at help@moz.com and we'll do our best to sort things out for you.
-
Hi,
Please check this older thread on the same topic: https://moz.com/community/q/block-an-entire-subdomain-with-robots-txt
Thanks