Do i have my robots.txt file set up properly
-
Hi, just doing some seo on my site and i am not sure if i have my robots file set correctly. i use joomla and my website is www.in2town.co.uk.
here is my robots file, does this look correct to you
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/many thanks
-
thanks for this, i will add a sitemap now
-
thanks for this. been having for a long time trouble with a site map. the reason is, i use joomla 1.5 and i am not sure the best way to have it set or which is the best tool to use.
my articles change all the time and not sure how many of the articles i should have in the site map or to have just the sections.
on an old site i had all the articles, well up to 2,000 and that gain me a lot of traffic but with the new site i took that down
-
Yes, this does look good. However, usually the robots.txt will define a location of a sitemap. Not absolutely needed, but good to know.
Here is an example of one of our client's wordpress sites.
User-agent: * Disallow: /wp-admin Disallow: /another-post Disallow: /dolor-and-the-sit-amet/ Disallow: /hello-world-2-2/ Disallow: /second-page-post/ Disallow: /hello-world-2-3/ Disallow: /tag/ Disallow: /events/ Disallow: /wp-content/ Sitemap: http://backcountrysnow.com/sitemap.xml.gz
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt allows wp-admin/admin-ajax.php
Hello, Mozzers!
Technical SEO | | AndyKubrin
I noticed something peculiar in the robots.txt used by one of my clients: Allow: /wp-admin/admin-ajax.php What would be the purpose of allowing a search engine to crawl this file?
Is it OK? Should I do something about it?
Everything else on /wp-admin/ is disallowed.
Thanks in advance for your help.
-AK:2 -
How to properly do reviews with Rich Text Snippets
I am trying to find out the best way to do this. Do you use hreview? Thanks,
Technical SEO | | netviper0 -
Where Is This Being Addended to Our Page File Names?
I have worked over the last several months to eliminate duplicate page titles at our site. Below is one situation that I need your advice on. Google Webmaster Tools is reporting several of our pages with
Technical SEO | | lbohen
duplicate title such as this one: This is a valid page at our Web store: http://www.audiobooksonline.com/159179126X.html This is an invalid page that Google says is a duplicate of the one above: http://www.audiobooksonline.com/159179126X.html?gdftrk=gdfV2138_a_7c177_a_7c432_a_7c9781591791263 Where might the code ?gdftrk=.... be coming from? How to get rid of it?0 -
Timely use of robots.txt and meta noindex
Hi, I have been checking every possible resources for content removal, but I am still unsure on how to remove already indexed contents. When I use robots.txt alone, the urls will remain in the index, however no crawling budget is wasted on them, But still, e.g having 100,000+ completely identical login pages within the omitted results, might not mean anything good. When I use meta noindex alone, I keep my index clean, but also keep Googlebot busy with indexing these no-value pages. When I use robots.txt and meta noindex together for existing content, then I suggest Google, that please ignore my content, but at the same time, I restrict him from crawling the noindex tag. Robots.txt and url removal together still not a good solution, as I have failed to remove directories this way. It seems, that only exact urls could be removed like this. I need a clear solution, which solves both issues (index and crawling). What I try to do now, is the following: I remove these directories (one at a time to test the theory) from the robots.txt file, and at the same time, I add the meta noindex tag to all these pages within the directory. The indexed pages should start decreasing (while useless page crawling increasing), and once the number of these indexed pages are low or none, then I would put the directory back to robots.txt and keep the noindex on all of the pages within this directory. Can this work the way I imagine, or do you have a better way of doing so? Thank you in advance for all your help.
Technical SEO | | Dilbak0 -
I accidentally blocked Google with Robots.txt. What next?
Last week I uploaded my site and forgot to remove the robots.txt file with this text: User-agent: * Disallow: / I dropped from page 11 on my main keywords to past page 50. I caught it 2-3 days later and have now fixed it. I re-imported my site map with Webmaster Tools and I also did a Fetch as Google through Webmaster Tools. I tweeted out my URL to hopefully get Google to crawl it faster too. Webmaster Tools no longer says that the site is experiencing outages, but when I look at my blocked URLs it still says 249 are blocked. That's actually gone up since I made the fix. In the Google search results, it still no longer has my page title and the description still says "A description for this result is not available because of this site's robots.txt – learn more." How will this affect me long-term? When will I recover my rankings? Is there anything else I can do? Thanks for your input! www.decalsforthewall.com
Technical SEO | | Webmaster1230 -
Is there a reason to set a crawl-delay in the robots.txt?
I've recently encountered a site that has set a crawl-delay command set in their robots.txt file. I've never seen a need for this to be set since you can set that in Google Webmaster Tools for Googlebot. They have this command set for all crawlers, which seems odd to me. What are some reasons that someone would want to set it like that? I can't find any good information on it when researching.
Technical SEO | | MichaelWeisbaum0 -
Warnings for blocked by blocked by meta-robots/meta robots Nofollow...how to resolve?
Hello, I see hundreds of notices for blocked by meta-robots/meta robots nofollow and it appears it is linked to the comments on my site which I assume I would not want to be crawled. Is this the case and these notices are actually a positive thing? Please advise how to clear them up if these notices can be potentially harmful for my SEO. Thanks, Talia
Technical SEO | | M80Marketing0 -
What does it mean by 'blocked by Meta Robot'? How do I fix this?
When i get my crawl diagnostics, I am getting a blocked by Meta Robot, which means that my page is not being indexed in the search engines... obviously this is a major issue for organic traffic!!! What does it actually mean, and how can i fix it?
Technical SEO | | rolls1230