I want to block search bots in crawling all my website's pages expect for homepage. Is this rule correct?
-
User-agent: *
Disallow: /*
-
-
Thanks Matt! I will surely test this one.
-
Thanks David! Will try this one.
-
Use this:
User-agent: Googlebot
Noindex: /User-agent: Googlebot
Disallow: /User-agent: *
Disallow: /This is what I use to block our dev sites from being indexed and we've had no issues.
-
Actually, there are two regex that Robots can handle - asterisk and $.
You should test this one. I think it will work (about 95% sure - tested in WMT quickly):
User-agent: *
Disallow: /
Allow: /$ -
I don't think that will work. Robots.txt doesn't handle regular expressions. You will have to explicitly list all of the folders, and files to be super sure, that nothing is indexed unless you want it to be found.
This is kind of an odd question. I haven't thought about something like this in a while. I usually want everything but a couple folders indexed. : ) I found something that may be a little more help. Try reading this.
If you're working with extensions, you can use **Disallow:/*.html$ **or php or what have you. That may get you closer to a solution.
Definitely test this with a crawler that obeys robots.txt.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Spammy page with canonical reference to my website
A potentially spammy website http://www.rofof.com/ has included a rel canonical tag pointing to my website. They've included the tag on thousands of pages on their website. Furthermore http://www.rofof.com/ appears to have backlinks from thousands of other low-value domains For example www.kazamiza.com/vb/kazamiza242122/, along with thousands of other pages on thousands of other domains all link to pages on rofof.com, and the pages they link to on rofof.com are all canonicalized to a page on my site. If Google does respect the canonical tag on rofof.com and treats it as part of my website then the thousands of spammy links that point to rofof.com could be considered as pointing to my website. I'm trying to contact the owner of www.rofof.com hoping to have the canonical tag removed from their website. In the meantime, I've disavowed the www.rofof.com, the site that has canonical tag. Will that have any effect though? Will disavow eliminate the effect of a rel canonical tag on the disavowed domain or does it only affect links on the disavowed website? If it only affects links then should I attempt to disavow all the pages that link to rofof.com? Thanks for reading. I really appreciate any insight you folks can offer.
Intermediate & Advanced SEO | | brucepomeroy1 -
Can 'follow' rather than 'nofollow' links be damaging partner's SEO
Hey guys and happy Monday! We run a content rich website, 12+ years old, focused on travel in a specific region, and advertisers pay for banners/content etc alongside editorial. We have never used 'nofollow' website links as they're no explicitly paid for by clients, but a partner has asked us to make all links to them 'nofollow' as they have stated the way we currently link is damaging their SEO. Could this be true in any way? I'm only assuming it would adversely affect them if our website was peanalized by Google for 'selling links', which we're not. Perhaps they're just keen to follow best practice for fear of being seen to be buying links. FYI we now plan to change to more full use of 'nofollow', but I'm trying to work out what the client is refering to without seeming ill-informed on the subject! Thank you for any advice 🙂
Intermediate & Advanced SEO | | SEO_Jim0 -
Webpage has bombed outside of Top 50 for search term in one week. What's the cause?
I've been monitoring the performance of some pages via the email Moz sends every week, and until this week two pages that I've managed to get ranking have ranked between 20 and 23 for the specific term. However, today on the email one of the pages for one search term has bombed out of the top 50 while the other page has remained unaffected. What could be the cause for this? I've looked at Google Webmasters for an indication of a penalty of some sort but there is nothing glaringly obvious. I've no messages on there, and I haven't bought a load of spam links at all. What else could I check?
Intermediate & Advanced SEO | | mickburkesnr0 -
How do we better optimize a site to show the correct domain in organic search results for the location the user is searching in?
For example, chicago-company.com has the same content as springfield-company.com and I am searching for a general non-brand term (i.e. utility bill pay) and am located in Chicago. How can we optimize the chicago-company.com to ensure that chicago's site results are in top positions over springfields site?
Intermediate & Advanced SEO | | aelite1 -
Any idea why this page isn't indexing?
Hi Mozzers, Question for all of you. Any idea why this page isn't indexing in Google? It's indexing in Bing, but we don't see it in Google's results. It doesn't seem like we have any noindex tags or anyway issues with the robots files either. Any ideas? http://ohva.k12.com/
Intermediate & Advanced SEO | | petertong230 -
Will pages irrelevant to a site's core content dilute SEO value of core pages?
We have a website with around 40 product pages. We also have around 300 pages with individual ingredients used for the products and on top of that we have some 400 pages of individual retailers which stock the products. Ingredient pages have same basic short info about the ingredients and the retail pages just have the retailer name, adress and content details. Question is, should I add noindex to all the ingredient and or retailer pages so that the focus is entirely on the product pages? Thanks for you help!
Intermediate & Advanced SEO | | ArchMedia0 -
Soft 404's from pages blocked by robots.txt -- cause for concern?
We're seeing soft 404 errors appear in our google webmaster tools section on pages that are blocked by robots.txt (our search result pages). Should we be concerned? Is there anything we can do about this?
Intermediate & Advanced SEO | | nicole.healthline4 -
To index or not to index search pages - (Panda related)
Hi Mozzers I have a WordPress site with Relevanssi the search engine plugin, free version. Questions: Should I let Google index my site's SERPS? I am scared the page quality is to thin, and then Panda bear will get angry. This plugin (or my previous search engine plugin) created many of these "no-results" uris: /?s=no-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Akids+wall&cat=no-results&pg=6 I have added a robots.txt rule to disallow these pages and did a GWT URL removal request. But links to these pages are still being displayed in Google's SERPS under "repeat the search with the omitted results included" results. So will this affect me negatively or are these results harmless? What exactly is an omitted result? As I understand it is that Google found a link to a page they but can't display it because I block GoogleBot. Thanx in advance guys.
Intermediate & Advanced SEO | | ClassifiedsKing0