Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Good robots txt for magento
-
Dear Communtiy,
I am trying to improve the SEO ratings for my website www.rijwielcashencarry.nl (magento). My next step will be implementing robots txt to exclude some crawling pages.
Does anybody have a good magento robots txt for me? And what need i copy exactly?Thanks everybody!
Greetings,
Bob
-
This is fine, as long as you don't want to exclude robots from crawling any part of your site.
-
Me to have this problem, if someone can help with setting root.txt
my webcurrent configuration is
Sitemap: http://www.myweb/sitemap.xml
User-agent: *
Disallow:THIS IS GOOD ?
-
Hi Ruth,
Also thanks for your response!
Greetings,
Bob
-
Hi Peter,
Thanks for your response! I am going to follow up your advice and build a good Robots TXT.
Greetings,
Bob
-
Peter is correct - your search, admin and user pages are common pages to block for Magento. What you block is up to you, though. Don't forget that a page that is blocked by robots.txt can still be found by search engines, so if it's a page that will contain private information you should protect it with a password.
-
Hi there! Did Peter's response take care of this for you? If so, please mark it as a "Good Answer."
-
Hi,
Creating robots.txt file for the site is one of the most important thing, you need to understand your website or stores basic needs what to keep private and what to make public; I think you need to block some part in your magento site like your search pages (?*sid) and admin pages, user dashboard pages, here is an example links Robots.txt for Magento and Robots.txt File Examples
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do I customize Magento product urls?
I would like my product urls to be /category/manufacturer/name/part#. This would be the only url the item uses and how the product is accessed. It would also be used for product feeds. My first attempt was to use https://amasty.com/magento-unique-product-url.html This creates a single url but I can not customize it. Sometimes it selects the manufacturer and sometimes the category. My second attempt was with https://www.magentocommerce.com/magento-connect/custom-product-urls-seo.html I have it installed but it doesn't change the urls. Has anyone been able to do this successfully?
Technical SEO | | Tylerj0 -
Issues with Magento layered navigation
Hi, We use Magento v.1.7 for our store. We have recently had an SEO audit and we have uncovered 2 major issues which can be pinpointed to our layered navigation. We use the MANAdev layered navigation module. There are numerous options available to help with SEO. All our filtered urls seem to be fine ie. https://www.tidy-books.co.uk/childrens-bookcases-shelves/colour/natural-finish-with-letters/letters/lowercase have canonical url correctly setup and the meta tags as noindex, follow but Magento is churning out tons of 404 error pages like this https://www.tidy-books.co.uk/childrens-bookcases-shelves/show/12/l/colour:24-4-9/letters:6-7 which google is indexing I'm at lost at how to solve this any help would be great. Thank you **This is from our SEO audit report ** The faceted navigation isn’t handled correctly and causes two major issues:● One of the faceted navigation filters causes 404 error. This means that the error isappended each sequence of the navigation options, multiplying the faulty URLs.● The pages created by the faceted nav are all accessible to the search engines. Thismeans that there are hundreds of duplicated category pages created by one of theparameters. The duplication issues can seriously hinder the organic visibility.The amount of 404 errors and the duplicated pages created by faceted navigation makes italmost impossible for a search engine crawler to finish the crawl. This means that the sitemight not be fully indexed and the newly introduced product pages or content won’t bediscovered for a very long time.
Technical SEO | | tidybooks0 -
One robots.txt file for multiple sites?
I have 2 sites hosted with Blue Host and was told to put the robots.txt in the root folder and just use the one robots.txt for both sites. Is this right? It seems wrong. I want to block certain things on one site. Thanks for the help, Rena
Technical SEO | | renalynd270 -
Robots.txt on http vs. https
We recently changed our domain from http to https. When a user enters any URL on http, there is an global 301 redirect to the same page on https. I cannot find instructions about what to do with robots.txt. Now that https is the canonical version, should I block the http-Version with robots.txt? Strangely, I cannot find a single ressource about this...
Technical SEO | | zeepartner0 -
Log in, sign up, user registration and robots
Hi all, We have an accommodation site that asks users only to register when they want to book a room, in the last step. Though this is the ideal situation when you have tons of users, nowadays we are having around 1500 - 2000 per day and making tests we found out that if we ask for a registration (simple, 1 click FB) we mail them all and through a good customer service we are increasing our sales. That is why, we would like to ask users to register right after the home page ie Home/accommodation or and all the rest. I am not sure how can I make to make that content still visible to robots.
Technical SEO | | Eurasmus.com
Will the authentication process block google crawling it? Maybe something we can do? We are not completely sure how to proceed so any tip would be appreciated. Thank you all for answering.3 -
Empty Meta Robots Directive - Harmful?
Hi, We had a coding update and a side-effect of that was that our directive was emptied, in other words it now reads as: on all of the site. I've since noticed that Google's cache date on all of the pages - at least, the ones I tested - have a Cached date of no later than 17 December '12 - that's the Monday after the directive was removed on mass. So, A, does anyone have solid evidence of an empty directive causing problems? Past experience, Matt Cutts, Fishkin quote, etc. And then B - It seems fairly well correlated but, does my entire site's homogenous Cached date point to this tag removal? Or is it fairly normal to have a particular cache date across a large site (we're a large ecommerce site). Our site: http://www.zando.co.za/ I'm having the directive reinstated as soon as Dev permitting. And then, for extra credit, is there a way with Google's API, or perhaps some other tool, to run an arbitrary list and retrieve Cached dates? I'd want to do this for diagnosis purposes and preferably in a way that OK with Google. I'd avoid CURLing for the cached URL and scraping out that dates with BASH, or any such kind of thing. Cheers,
Technical SEO | | RocketZando0 -
Googlebot does not obey robots.txt disallow
Hi Mozzers! We are trying to get Googlebot to steer away from our internal search results pages by adding a parameter "nocrawl=1" to facet/filter links and then robots.txt disallow all URLs containing that parameter. We implemented this late august and since that, the GWMT message "Googlebot found an extremely high number of URLs on your site", stopped coming. But today we received yet another. The weird thing is that Google gives many of our nowadays robots.txt disallowed URLs as examples of URLs that may cause us problems. What could be the reason? Best regards, Martin
Technical SEO | | TalkInThePark0 -
Robots.txt file getting a 500 error - is this a problem?
Hello all! While doing some routine health checks on a few of our client sites, I spotted that a new client of ours - who's website was not designed built by us - is returning a 500 internal server error when I try to look at the robots.txt file. As we don't host / maintain their site, I would have to go through their head office to get this changed, which isn't a problem but I just wanted to check whether this error will actually be having a negative effect on their site / whether there's a benefit to getting this changed? Thanks in advance!
Technical SEO | | themegroup0