Robots.txt - Allow and Disallow. Can they be the same?
-
Hi All,
I need some help on the following:
Are the following commands the same?
User-agent: *
Disallow:
or
User-agent: *
Allow: /
I'm a bit confused. I take it that the first one allows all the bots but the second one blocks all the bots.
Is that correct?
Many thanks,
Aidan
-
Hi Aidan
I'm getting a similar problem on a site I'm working on. The on-page rank checker says it "can't reach the page". I've checked everything obvious (at least I think I have!).
May I ask how you eventually resolved it?
Thanks Aidan
-
Hi
You can use this tool to make sure the crawler can see your files:
http://pro.seomoz.org/tools/crawl-test
but you have to wait to receive the report by email.
When you say:
"I get the following msg when I try to run On Page Analysis:"
is this the tool?
http://pro.seomoz.org/tools/on-page-keyword-optimization/new
To check the website you can use this:
http://www.opensiteexplorer.org
Ciao
Maurizio
-
Hi,
Thanks for the clarification. So the Robots.txt isn't blocking anything.
Do you know why, then, I cannot use SEOmoz On-Page Analysis, and why Xenu and Screaming Frog only return 3 URLs?
I get the following msg when I try to run On Page Analysis:
"Oops! We were unable to reach the page you requested for your report. Please try again later."
Would there be something else blocking me? GWMT Parameters maybe?
-
It's a pleasure.
But I don't understand the problem. If the site has this robots.txt:
User-agent: *
Allow: /
then every crawler, SEOmoz included, can see and index all the files on the website.
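As a sanity check, you could also test the live file directly. Here is a minimal sketch using Python's standard-library urllib.robotparser; example.com is a placeholder for the real domain, and the check uses rogerbot, the user agent SEOmoz's crawler identifies itself as:

import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.set_url("http://example.com/robots.txt")  # placeholder: use the real domain
parser.read()  # fetches and parses the live robots.txt
# True means the robots.txt does not block this crawler from the homepage
print(parser.can_fetch("rogerbot", "http://example.com/"))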
Maybe the problem is different?
Ciao
-
Thanks Maurizio,
I need to do some analysis on this site. Is there a way to get my SEO tools (Screaming Frog, SEOmoz) to ignore the robots.txt so I can do a proper site audit?
Thanks again for the answers. Much appreciated
Aidan
-
Hi Aidan
User-agent: *
Disallow:

and

User-agent: *
Allow: /

are the same.
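If you want to verify the equivalence yourself, here is a minimal sketch using Python's standard-library urllib.robotparser; the example.com URL is just a placeholder:

import urllib.robotparser

# The two rule sets Aidan asked about
empty_disallow = ["User-agent: *", "Disallow:"]
allow_everything = ["User-agent: *", "Allow: /"]

for rules in (empty_disallow, allow_everything):
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(rules)
    # can_fetch() reports whether the given user agent may request the URL
    print(rules, "->", parser.can_fetch("*", "http://example.com/any/page.html"))

# Both lines print True: neither file blocks anything.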
Ciao
Maurizio
-
Hi Maurizio,
The reason I asked is because I'm working on a site and its robots.txt is:
User-agent: *
Allow: /
Why would they have this?
I can't use On-Page Analysis, and Screaming Frog only returns 3 URLs.
Thanks again,
Aidan
-
Hi
1st example:
User-agent: *
Disallow:
Every user agent can crawl and index your files.

2nd example:
User-agent: *
Disallow: /
No user agent can crawl or index your files.
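You can see the difference programmatically with a small sketch using Python's standard-library urllib.robotparser; the URL below is a placeholder:

import urllib.robotparser

def allowed(robots_lines, url):
    # Returns True if a wildcard user agent may fetch the URL
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch("*", url)

url = "http://example.com/page.html"  # placeholder URL
print(allowed(["User-agent: *", "Disallow:"], url))    # True: blocks nothing
print(allowed(["User-agent: *", "Disallow: /"], url))  # False: blocks everything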
More examples here:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
Ciao
Maurizio