Do I have my robots.txt file set up properly?
-
Hi, I'm doing some SEO on my site and I'm not sure whether my robots.txt file is set up correctly. I use Joomla and my website is www.in2town.co.uk.
Here is my robots.txt file. Does this look correct to you?
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
Many thanks
-
Thanks for this, I will add a sitemap now.
-
Thanks for this. I have been having trouble with a sitemap for a long time. The reason is that I use Joomla 1.5 and I am not sure of the best way to set it up or which tool is best to use.
My articles change all the time, and I am not sure how many of the articles I should include in the sitemap, or whether to include just the sections.
On an old site I had all the articles, up to 2,000, and that gained me a lot of traffic, but with the new site I took that down.
-
Yes, this does look good. However, a robots.txt file usually also gives the location of a sitemap. It is not strictly required, but good to know.
Here is an example from one of our clients' WordPress sites.
User-agent: *
Disallow: /wp-admin
Disallow: /another-post
Disallow: /dolor-and-the-sit-amet/
Disallow: /hello-world-2-2/
Disallow: /second-page-post/
Disallow: /hello-world-2-3/
Disallow: /tag/
Disallow: /events/
Disallow: /wp-content/
Sitemap: http://backcountrysnow.com/sitemap.xml.gz
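If you want to sanity-check a robots.txt file before uploading it, one option is to run it through Python's standard-library parser. The sketch below uses the Joomla rules from the question with a Sitemap line added. The sitemap URL and the test paths are assumptions for illustration only (the thread does not give a sitemap location for that site), and `site_maps()` needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the question, plus a Sitemap line.
# ASSUMPTION: the sitemap URL is hypothetical; substitute the real location.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
Sitemap: http://www.in2town.co.uk/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)
base = "http://www.in2town.co.uk"

# ASSUMPTION: these paths are made-up examples, not real pages on the site.
for path in ("/", "/administrator/index.php", "/templates/style.css"):
    verdict = "allowed" if rules.can_fetch("*", base + path) else "blocked"
    print(path, verdict)

# Sitemap locations declared in the file (Python 3.8+).
print(rules.site_maps())
```

This only shows how a standards-following parser reads the file; individual search engines may treat some details differently, so it is a quick check rather than a guarantee.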