Block parent folder in robot.txt, but not children
-
Example:
I want to block this URL (which shows up in Webmaster Tools as an error):
http://www.siteurl.com/news/events-calendar/usa
But not this:
-
The idea from Andrew is nice, but my guess would be that you're targeting multiple events so that might run into issues. What you could do is add some more regular expression and make it like this:
Disallow: ^/news/events-calendar/usa$
-
You could use "allow" in your robots.txt file for just this problem.
allow: news/events-calendar/usa/event-name
disallow: /news/events-calendar/usa
See the allow directive section of this page: https://en.wikipedia.org/wiki/Robots_exclusion_standard
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Best Location for Copy Block
We are having discussions around the appropriate location to place the SEO copy block on an eCommerce category page. Would like to get the communities opinion to share with the creative team.
Web Design | | TukTown0 -
Will There Be Much Impact When Moving Site To New Root Folder?
Hi, ok so I have a pretty big site that is located on my sever /root/current-folder/. I want to rebuild the site completely as it's using software that is out of date and not our main focus anymore (OpenCart). We want to move to a Wordpress platform, but want to have as little impact on the SEO as possible. Our current strategy is: List all URLs/Titles/Meta indexed with Google on current site Create new folder on the server /root/new-folder/ My question is... if I move to a new folder on the server (same TLD) and then re-route the TLD to go to this new folder, will there be more of an impact on SEO that if I start a fresh in the current folder? Thanks
Web Design | | Easigrass0 -
Bing Indexation and handling of X-ROBOTS tag or AngularJS
Hi MozCommunity, I have been tearing my hair out trying to figure out why BING wont index a test site we're running. We're in the midst of upgrading one of our sites from archaic technology and infrastructure to a fully responsive version.
Web Design | | AU-SEO
This new site is a fully AngularJS driven site. There's currently over 2 million pages and as we're developing the new site in the backend, we would like to test out the tech with Google and Bing. We're looking at a pre-render option to be able to create static HTML snapshots of the pages that we care about the most and will be available on the sitemap.xml.gz However, with 3 completely static HTML control pages established, where we had a page with no robots metatag on the page, one with the robots NOINDEX metatag in the head section and one with a dynamic header (X-ROBOTS meta) on a third page with the NOINDEX directive as well. We expected the one without the meta tag to at least get indexed along with the homepage of the test site. In addition to those 3 control pages, we had 3 pages where we had an internal search results page with the dynamic NOINDEX header. A listing page with no such header and the homepage with no such header. With Google, the correct indexation occured with only 3 pages being indexed, being the homepage, the listing page and the control page without the metatag. However, with BING, there's nothing. No page indexed at all. Not even the flat static HTML page without any robots directive. I have a valid sitemap.xml file and a robots.txt directive open to all engines across all pages yet, nothing. I used the fetch as Bingbot tool, the SEO analyzer Tool and the Preview Page Tool within Bing Webmaster Tools, and they all show a preview of the requested pages. Including the ones with the dynamic header asking it not to index those pages. I'm stumped. I don't know what to do next to understand if BING can accurately process dynamic headers or AngularJS content. Upon checking BWT, there's definitely been crawl activity since it marked against the XML sitemap as successful and put a 4 next to the number of crawled pages. Still no result when running a site: command though. Google responded perfectly and understood exactly which pages to index and crawl. Anyone else used dynamic headers or AngularJS that might be able to chime in perhaps with running similar tests? Thanks in advance for your assistance....0 -
How to command Robots.txt to this:
Hi, So for some reason I have this unexplained issues in webmaster tools. Check them out: http://prntscr.com/7n1nj8 See that iSeeCars.com? How to remove it? Is it just disallow: iseecars.com? Or should I disallow the search to be crawled? Regards,
Web Design | | Kokolo0 -
Wordpress Theme is blocking alt tags. Does anybody know of any special plugins?
We have a special wordpress theme for nataliecass.com. Unfortunately the theme is blocking all the alt tags (this is a photography website...alt tags are very important). Does anybody know of any special WP plugins for alt tags? Thanks
Web Design | | VanguardCommunications0 -
Is anyone using Humans.txt in your websites? What do you think?
http://humanstxt.org Anyone using this on their websites and if so have you seen and positive benefits of doing so? Would be good to see some examples of sites using it and potentially how you're using the files. I'm considering adding this to my checklist for launching sites
Web Design | | eseyo1 -
How to fix and issue with robot.txt ?
I am receiving the following error message through webmaster tools http://www.sourcemarketingdirect.com/: Googlebot can't access your site Oct 26, 2012
Web Design | | skehoe
Over the last 24 hours, Googlebot encountered 35 errors while attempting to access your robots.txt. To ensure that we didn't crawl any pages listed in that file, we postponed our crawl. Your site's overall robots.txt error rate is 100.0%. The site has dropped out of Google search.0 -
Search directory - How to apply robots
Hi. On the site I'm working on, we use a search directory to display our search results. It displays as follows - Mydomain.com/search-results/# With the dynamic search results appearing after the hash tag. Because of the structure of the website, many of the lefthand nav defers back to this directory. I know that most websites "noindex, nofollow" the search results pages, but due to the ease of customers generating them, I'm afraid that if I do this, we'll miss out on the inevitable links customers will provide...and, even though it's just the main search directory, these links will still help my domain. The search is all java-generated so there's nothing for spiders to follow within this directory - save the standard category nav. How should I handle this? Thanks.
Web Design | | Blenny0