Robots.txt file
-
How do I get Google to stop indexing my old pages and start indexing my new pages, even months down the line?
Do I need to install a robots.txt file on each page?
-
What CMS are you using? If it is WordPress, there is a plugin called WP Robots Txt that lets you configure how often Google is asked to crawl your site; you can set it up so that your "archives" are only crawled monthly, and indicate how often you add new content to the site.
-
There should be only one robots.txt file for the entire site. Do you mean "Do I need to add a separate line item within the robots.txt file that disallows each page I want deindexed?" If so, that would not be the best solution.
Other questions need to be asked as well.
1. Have you created a sitemap.xml file with only the new pages in it? And if so, have you submitted that to Google through Webmaster Tools?
2. Have you obtained any off-site links pointing to the new pages?
3. Have you performed a site:mydomain.com search in Google to see whether those new pages have been indexed?
4. Have you checked Google Webmaster Tools to see whether their system has come across serious errors on your site that might be related?
5. Does your site's robots.txt file currently block those new pages from being crawled?
All of these need to be answered to help determine a course of action.
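On question 1, a sitemap.xml that lists only the new pages is straightforward to generate. Here is a minimal sketch in Python using only the standard library; the URLs and filename are placeholders, not taken from the thread:

```python
import xml.etree.ElementTree as ET

# Hypothetical list of new-page URLs; substitute your own.
new_pages = [
    "https://www.example.com/new-page-1/",
    "https://www.example.com/new-page-2/",
]

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)
for page in new_pages:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = page

# Write the sitemap file that you would then submit in Webmaster Tools.
ET.ElementTree(urlset).write("sitemap.xml",
                             encoding="utf-8",
                             xml_declaration=True)
```

Once generated, the file is submitted through the Webmaster Tools sitemap report so Google discovers the new URLs directly rather than waiting on a recrawl.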
Related Questions
-
Using one robots.txt for two websites
I have two websites that are hosted in the same CMS. Rather than having two separate robots.txt files (one for each domain), my web agency has created one which lists the sitemaps for both websites, like this:
User-agent: *
Disallow:
Sitemap: https://www.siteA.org/sitemap
Sitemap: https://www.siteB.com/sitemap
Is this OK? I thought you needed one robots.txt per website which provides the URL for the sitemap. Will having both sitemap URLs listed in one robots.txt confuse the search engines?
Technical SEO | ciehmoz
-
CSS and JavaScript files - website redesign project
UPDATED: We ran a crawl of the old website and have a list of CSS and JavaScript links that are part of the old website content. As the website is redesigned from scratch, I don't think these old CSS and JavaScript files are being used for anything on the new site. I've read elsewhere online that you redirect "all" content files if launching/migrating to a new site. We are debating whether this is needed for CSS and JavaScript files. Examples: (A) http://website.com/wp-content/themes/style.css (B) http://website.com/wp-includes/js/wp-embed.min.js?ver=4.8.1
Technical SEO | CREW-MARKETING
-
Robots.txt - "File does not appear to be valid"
Good afternoon Mozzers! I've got a weird problem with one of the sites I'm dealing with. For some reason, one of the developers changed the robots.txt file to disallow every page on the site - not a wise move! To rectify this, we uploaded the new robots.txt file to the domain's root as per Webmaster Tools' instructions. The live file is: User-agent: * (http://www.savistobathrooms.co.uk/robots.txt) I've submitted the new file in Webmaster Tools and it's pulling it through correctly in the editor. However, Webmaster Tools is not happy with it, for some reason. I've attached an image of the error. Does anyone have any ideas? I'm managing another site with the exact same robots.txt file and there are no issues. Cheers, Lewis
Technical SEO | PeaSoupDigital
-
Creating a CSV file for uploading 301 redirect URL map
Hi, if I'm bulk uploading 301 redirects, what's needed to create a CSV file? Is it just a case of creating an Excel spreadsheet with the old URLs in column A and the new URLs in column B, then converting to CSV and uploading? Or do I need to put in other details or parameters? Cheers, Dan
Technical SEO | Dan-Lawrence
-
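As a sketch of the spreadsheet-to-CSV step that question describes, the two-column old-URL/new-URL layout can also be produced directly with Python's csv module, skipping Excel entirely; the paths below are invented examples:

```python
import csv

# Hypothetical old -> new URL map; replace with your real URLs.
redirects = [
    ("/old-page-1/", "/new-page-1/"),
    ("/old-page-2/", "/new-page-2/"),
]

# Write one redirect pair per row: column A = old URL, column B = new URL.
with open("redirect-map.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for old_url, new_url in redirects:
        writer.writerow([old_url, new_url])
```

Whether extra columns (status code, match type) are needed depends entirely on the importer being used, so check its documentation before uploading.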
Robots.txt checker
Google seems to have discontinued their robots.txt checker. Is there another tool that I can use to check my robots.txt instead? Thanks!
Technical SEO | theLotter
-
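One stand-in for a web-based checker is Python's standard-library robotparser, which can test sample URLs against a robots.txt locally; the rules and URLs below are made-up examples:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; in practice, paste in your live file,
# or use parser.set_url(...) followed by parser.read() to fetch it.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-section/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Check whether Googlebot may crawl each URL under these rules.
print(parser.can_fetch("Googlebot",
                       "https://www.example.com/new-page/"))            # True
print(parser.can_fetch("Googlebot",
                       "https://www.example.com/old-section/page.html"))  # False
```

This only validates crawl rules against the standard parsing behavior; it won't flag syntax quirks specific to any one search engine's implementation.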
Blocking other engines in robots.txt
If the primary target of your business is not in China, is there any benefit to blocking Chinese search robots in robots.txt?
Technical SEO | Romancing
-
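If one did decide to block them, the conventional pattern is a per-crawler rule (Baiduspider is Baidu's main crawler), while leaving everything else open; note that robots.txt is advisory and only honored by well-behaved bots. A sketch:

```
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
```

Since misbehaving crawlers ignore robots.txt anyway, the main effect of a rule like this is saving a little crawl load rather than any real access control.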
Is blocking RSS Feeds with robots.txt necessary?
Is it necessary to block an RSS feed with robots.txt? It seems feeds are automatically not indexed (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html), and Google says here that it's important not to block RSS feeds (http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html). I'm just checking!
Technical SEO | nicole.healthline
-
Robots.txt and robots meta
I have an odd situation. I have a CMS that has a global robots.txt which has the generic:
User-agent: *
Allow: /
I also have one CMS site that needs to not be indexed ever. I've read in various pages (like http://www.jesterwebster.com/robots-txt-vs-meta-tag-which-has-precedence/22 ) that robots.txt always wins over meta, but I have also read that robots.txt indicates spiderability whereas meta can control indexation. I just want the site to not be indexed. Can I leave the robots.txt as is and still put NOINDEX in the robots meta?
Technical SEO | Highland
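For the situation that question describes, since the global robots.txt allows crawling (Allow: /), crawlers can reach each page and will therefore see a page-level robots meta tag; the standard noindex tag placed in the page's head looks like this:

```html
<meta name="robots" content="noindex">
```

The caveat runs the other way: if robots.txt blocked those pages, crawlers would never fetch them and so would never see the noindex instruction at all.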