Correct use for Robots.txt
-
I'm in the process of building a website and am experimenting with some new pages. I don't want search engines to begin crawling the site yet. I would like to use robots.txt to block the pages that I don't want them to crawl. If I do this, can I remove it later and get them to crawl those pages?
-
Lewis,
Thank you for the clarification!
-
Hi Eric
The guidance above means that when Google looks to crawl your site, it won't. It's not a message to Google telling it never to come back.
Once everything is sorted, remove whichever approach you took to block the search engines and supply a sitemap to Google via Webmaster Tools. Your site should be crawled in no time after that.
Hope this helps.
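
As a rough illustration of the sitemap step, here is a minimal Python sketch that builds a sitemap in the standard sitemaps.org XML format. The example.com URLs and the `build_sitemap` helper are hypothetical and only stand in for your own pages; the resulting file would then be submitted through Webmaster Tools.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; replace with the pages you want crawled.
pages = ["https://www.example.com/", "https://www.example.com/about"]
print(build_sitemap(pages))
```

Save the output as sitemap.xml at the site root once the blocking rules have been removed.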
-
Damian,
Thanks for your answer, that helps. If I add either one of the above items to my web page, and then remove it at a later date, will the search engines crawl and rank my site (at some time after they are removed)? In other words, and I know this sounds stupid, but does a search engine see a robots.txt file and never visit the site again?
-
Hey Eric,
If you want to create and work on pages but you don't want them indexed, you can add the following to the page in the <head> section (the pages will still be crawled):
<meta name="robots" content="noindex">
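
To double-check that a work-in-progress page really carries the standard robots meta tag with a noindex directive (`<meta name="robots" content="noindex">`), something like the following Python sketch could be used. The `RobotsMetaFinder` class and `has_noindex` function are hypothetical names for illustration, and the sketch only looks at the HTML, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # The name and its directives are matched case-insensitively.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(has_noindex(page))
```

Once the page is ready, removing the tag lets search engines index the page on a later crawl.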
If you want NONE of your pages to be crawled (i.e., the whole website), you can add the following to your robots.txt file:
User-agent: *
Disallow: /
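
If you want to see what these rules do before and after you lift the block, here is a minimal sketch using Python's standard `urllib.robotparser` module. The example.com URL and the `is_allowed` helper are hypothetical, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# While the site is being built: block every crawler from every path.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# After launch: an empty Disallow value permits everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

PAGE = "https://www.example.com/new-page.html"  # hypothetical URL

print(is_allowed(BLOCK_ALL, PAGE))  # False
print(is_allowed(ALLOW_ALL, PAGE))  # True
```

The block only applies while the rule is in the file; once it is removed or relaxed, crawlers are free to fetch the pages again on their next visit.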