Quick robots.txt check
-
We're working on an SEO update for http://www.gear-zone.co.uk at the moment. Could someone take a quick look at the new robots.txt file (http://gearzone.affinitynewmedia.com/robots.txt) to make sure we haven't missed anything?
Thanks
-
Plus, look around! Check out other companies' robots.txt files:
http://edition.cnn.com/robots.txt
http://www.nytimes.com/robots.txt
You can see which parts of their sites they don't consider relevant for search engines to crawl.
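If you want to compare several files side by side, a rough Python sketch like the one below pulls the Disallow paths out of robots.txt text and groups them by user agent. The helper name `disallowed_paths` and the sample rules are made up for illustration, and the parsing is deliberately simplified (no wildcard or Allow handling), so treat it as a starting point rather than a full robots.txt parser.

```python
from collections import defaultdict


def disallowed_paths(robots_text):
    """Group Disallow paths by user agent (simplified, hypothetical helper)."""
    rules = defaultdict(list)
    agents = []        # user agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:                      # a new group starts here
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:  # an empty Disallow blocks nothing
                for agent in agents:
                    rules[agent].append(value)
    return dict(rules)


# Made-up sample rules, not taken from any of the sites mentioned above.
sample = """\
User-agent: *
Disallow: /search/
Disallow: /private/

User-agent: Googlebot
Disallow: /tmp/
"""

for agent, paths in disallowed_paths(sample).items():
    print(agent, paths)
```

To run it against a live file, you could read the text first with `urllib.request.urlopen(url).read().decode()` and pass the result in.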
-
It's ok but very basic:
User-agent: *
Disallow: /myaccount/
Sitemap: /sitemap.xml
Do you want to stop crawlers from accessing the login page, for example?
Ours is something like this:
# Disallow All Engines From Admin and Login
User-Agent: *
Disallow: /index.php/
User-Agent: *
Disallow: /index.php/admin/
User-Agent: *
Disallow: /customer/account/login/

# Sitemap Files
sitemap: http://www.worldofbooks.com/sitemap.xml
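One way to sanity-check a file like the basic one above before it goes live is Python's standard-library `urllib.robotparser`. The sketch below feeds it the three-line file and asks whether a few paths would be blocked. The login path is borrowed from our file purely as a test case, `/myaccount/orders` and `/products/tent` are invented, and `is_blocked` is a hypothetical helper name. Parsers can differ in how they treat repeated `User-Agent: *` groups like the ones in our file, so the example sticks to the simpler one.

```python
from urllib.robotparser import RobotFileParser

# The basic file quoted above, one directive per line.
robots_txt = """\
User-agent: *
Disallow: /myaccount/
Sitemap: /sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_blocked(path, agent="*"):
    """Return True if the parsed rules disallow `path` for `agent`."""
    return not parser.can_fetch(agent, path)


# Illustrative paths only; swap in URLs from the real site.
for path in ("/myaccount/orders", "/customer/account/login/", "/products/tent"):
    print(path, "blocked" if is_blocked(path) else "allowed")
```

With this file the login path comes back as allowed, which is the gap the question above is pointing at. Search engines may interpret some rules differently from this parser, so it is worth confirming the result in each engine's own robots.txt testing tool as well.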