Robots.txt questions...
-
All,
My site is rather complicated, but I will try to break down my question as simply as possible.
I have a robots.txt file at the root level of my site to disallow robot access to /_system/, my CMS. It looks like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: *
Disallow: /_system/

I have another robots.txt file one level down, in my holiday database - www.mysite.com/holiday-database/ - which is there to disallow access to /holiday-database/ControlPanel/, my database CMS. It looks like this:
User-agent: *
Disallow: /ControlPanel/

Am I correct in thinking that this file must also be at the root level, and not in the /holiday-database/ level? If so, should my new robots.txt file look like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: *
Disallow: /_system/
Disallow: /holiday-database/ControlPanel/

Or like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: *
Disallow: /_system/
Disallow: /ControlPanel/

Thanks in advance.
Matt
-
Good answer, Yannick.
Here are some resources:
http://www.free-seo-news.com/all-about-robots-txt.htm
http://www.robotstxt.org/robotstxt.html
Good luck
-
Cheers gents.
-
Like this:

# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism

User-agent: *
Disallow: /_system/
Disallow: /holiday-database/ControlPanel/

Search engines typically only look in the root of your domain to find robots.txt and sitemap.xml files.
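If you want to sanity-check which URLs a combined root robots.txt actually blocks before deploying it, Python's standard-library robots.txt parser can simulate a crawler's view. A minimal sketch using the rules and www.mysite.com paths from the question above (the specific page URLs are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# Parse the proposed root robots.txt: both disallow rules in one file,
# with the ControlPanel path written relative to the site root.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /_system/",
    "Disallow: /holiday-database/ControlPanel/",
])

# The CMS and the database control panel are blocked for all robots...
print(rp.can_fetch("*", "http://www.mysite.com/_system/login"))                   # False
print(rp.can_fetch("*", "http://www.mysite.com/holiday-database/ControlPanel/"))  # False

# ...but the public holiday-database pages remain crawlable.
print(rp.can_fetch("*", "http://www.mysite.com/holiday-database/summer-deals"))   # True
```

Because Disallow rules are simple path prefixes matched against the full URL path, this also shows why `Disallow: /ControlPanel/` in a root file would not block /holiday-database/ControlPanel/.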
-
Hey Matt
The first of your options is right: Google and other engines look for the robots.txt file in the site root, not in each directory.
If you have a reason for not wanting that info in the root robots.txt file, you can always use the robots meta tag on the pages in a given directory.
A few useful links:
Robots.txt
http://www.google.com/support/webmasters/bin/answer.py?answer=156449&&hl=en
Robots Meta Tag
http://www.google.com/support/webmasters/bin/answer.py?answer=93710
Marcus
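For reference, the robots meta tag mentioned above is just a `<meta name="robots" ...>` tag in a page's head, so you can audit a directory's pages for it with the standard-library HTML parser. A rough sketch, assuming you already have each page's HTML as a string (the sample markup here is made up):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from a page's robots meta tag, if any."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # attr names arrive lowercased as (name, value) pairs
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            content = attrs.get("content", "")
            self.directives = [d.strip().lower() for d in content.split(",")]

# Example page that opts out of indexing via the meta tag:
html = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
parser = RobotsMetaParser()
parser.feed(html)
print(parser.directives)  # ['noindex', 'nofollow']
```

Unlike a robots.txt Disallow, the meta tag lives on each page, so it works per-page without exposing the directory name in a public root file.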