Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Good robots txt for magento
-
Dear Communtiy,
I am trying to improve the SEO ratings for my website www.rijwielcashencarry.nl (magento). My next step will be implementing robots txt to exclude some crawling pages.
Does anybody have a good magento robots txt for me? And what need i copy exactly?Thanks everybody!
Greetings,
Bob
-
This is fine, as long as you don't want to exclude robots from crawling any part of your site.
-
Me to have this problem, if someone can help with setting root.txt
my webcurrent configuration is
Sitemap: http://www.myweb/sitemap.xml
User-agent: *
Disallow:THIS IS GOOD ?
-
Hi Ruth,
Also thanks for your response!
Greetings,
Bob
-
Hi Peter,
Thanks for your response! I am going to follow up your advice and build a good Robots TXT.
Greetings,
Bob
-
Peter is correct - your search, admin and user pages are common pages to block for Magento. What you block is up to you, though. Don't forget that a page that is blocked by robots.txt can still be found by search engines, so if it's a page that will contain private information you should protect it with a password.
-
Hi there! Did Peter's response take care of this for you? If so, please mark it as a "Good Answer."
-
Hi,
Creating robots.txt file for the site is one of the most important thing, you need to understand your website or stores basic needs what to keep private and what to make public; I think you need to block some part in your magento site like your search pages (?*sid) and admin pages, user dashboard pages, here is an example links Robots.txt for Magento and Robots.txt File Examples
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
302 redirects in Magento, trying to fix
Hi all, I'm assigned a site in Magento. After the first craw, we found almost 15k 302 redirects. A sample URL ends with this /stores/store/switch/?SID=qdq9mf1u6afgodo1vtvk0ucdpb&___from_store=default&___store=german&uenc=aHR0cHM6Ly9qdWljeWZsdXRlcy5jb20vP19fX3N0b3JlPWdlcm1hbg%2C%2C And they are currently 302 redirecting to the homepage as well as other main pages and also product pages it seems. Some of these point to account pages where customers log in. Probably best for me to de-index those so no issues there. But I'm worried about the 302 redirects to public pages. The extension we have installed is SEO Suite Ultimate by MageWorx. Does anyone here have experience here specifically and how did you fix it? Thanks, JC
Technical SEO | | LASClients0 -
Robots.txt allows wp-admin/admin-ajax.php
Hello, Mozzers!
Technical SEO | | AndyKubrin
I noticed something peculiar in the robots.txt used by one of my clients: Allow: /wp-admin/admin-ajax.php What would be the purpose of allowing a search engine to crawl this file?
Is it OK? Should I do something about it?
Everything else on /wp-admin/ is disallowed.
Thanks in advance for your help.
-AK:2 -
Crawl solutions for landing pages that don't contain a robots.txt file?
My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader1 -
Robots.txt in subfolders and hreflang issues
A client recently rolled out their UK business to the US. They decided to deploy with 2 WordPress installations: UK site - https://www.clientname.com/uk/ - robots.txt location: UK site - https://www.clientname.com/uk/robots.txt
Technical SEO | | lauralou82
US site - https://www.clientname.com/us/ - robots.txt location: UK site - https://www.clientname.com/us/robots.txt We've had various issues with /us/ pages being indexed in Google UK, and /uk/ pages being indexed in Google US. They have the following hreflang tags across all pages: We changed the x-default page to .com 2 weeks ago (we've tried both /uk/ and /us/ previously). Search Console says there are no hreflang tags at all. Additionally, we have a robots.txt file on each site which has a link to the corresponding sitemap files, but when viewing the robots.txt tester on Search Console, each property shows the robots.txt file for https://www.clientname.com only, even though when you actually navigate to this URL (https://www.clientname.com/robots.txt) you’ll get redirected to either https://www.clientname.com/uk/robots.txt or https://www.clientname.com/us/robots.txt depending on your location. Any suggestions how we can remove UK listings from Google US and vice versa?0 -
Issues with Magento layered navigation
Hi, We use Magento v.1.7 for our store. We have recently had an SEO audit and we have uncovered 2 major issues which can be pinpointed to our layered navigation. We use the MANAdev layered navigation module. There are numerous options available to help with SEO. All our filtered urls seem to be fine ie. https://www.tidy-books.co.uk/childrens-bookcases-shelves/colour/natural-finish-with-letters/letters/lowercase have canonical url correctly setup and the meta tags as noindex, follow but Magento is churning out tons of 404 error pages like this https://www.tidy-books.co.uk/childrens-bookcases-shelves/show/12/l/colour:24-4-9/letters:6-7 which google is indexing I'm at lost at how to solve this any help would be great. Thank you **This is from our SEO audit report ** The faceted navigation isn’t handled correctly and causes two major issues:● One of the faceted navigation filters causes 404 error. This means that the error isappended each sequence of the navigation options, multiplying the faulty URLs.● The pages created by the faceted nav are all accessible to the search engines. Thismeans that there are hundreds of duplicated category pages created by one of theparameters. The duplication issues can seriously hinder the organic visibility.The amount of 404 errors and the duplicated pages created by faceted navigation makes italmost impossible for a search engine crawler to finish the crawl. This means that the sitemight not be fully indexed and the newly introduced product pages or content won’t bediscovered for a very long time.
Technical SEO | | tidybooks0 -
Is sitemap required on my robots.txt?
Hi, I know that linking your sitemap from your robots.txt file is a good practice. Ok, but... may I just send my sitemap to search console and forget about adding ti to my robots.txt? That's my situation: 1 multilang platform which means... ... 2 set of pages. One for each lang, of course But my CMS (magento) only allows me to have 1 robots.txt file So, again: may I have a robots.txt file woth no sitemap AND not suffering any potential SEO loss? Thanks in advance, Juan Vicente Mañanas Abad
Technical SEO | | Webicultors0 -
Self Referencing Links - Good or Bad?
As an agency we get quite a few of our clients come to us saying "Ooo, this company just contacted me saying they've run an SEO report on my site and we need to improve on these following things" We had one come through the other day that had reported on something we had not seen in any others before. They called them self-referencing links and marked it as a point of action should be taken. They had stated that 100% of the pages on our clients website had self-referencing links. The definition of self-referencing is when there is a link on a page that is linking to the page you are currently on. So for example you're on the home page and there is a link in the nav bar at the top that says "Home" with a link to the home page, the page you are already currently on. Is it bad practice? And if so can we do anything about it as it would seem strange from a UI point of view not to have a consistent navigation. I have not heard anything about this before but I wanted to get confirmation before going back to our client and explaining. Thanks Mozzers!
Technical SEO | | O2C0 -
Empty Meta Robots Directive - Harmful?
Hi, We had a coding update and a side-effect of that was that our directive was emptied, in other words it now reads as: on all of the site. I've since noticed that Google's cache date on all of the pages - at least, the ones I tested - have a Cached date of no later than 17 December '12 - that's the Monday after the directive was removed on mass. So, A, does anyone have solid evidence of an empty directive causing problems? Past experience, Matt Cutts, Fishkin quote, etc. And then B - It seems fairly well correlated but, does my entire site's homogenous Cached date point to this tag removal? Or is it fairly normal to have a particular cache date across a large site (we're a large ecommerce site). Our site: http://www.zando.co.za/ I'm having the directive reinstated as soon as Dev permitting. And then, for extra credit, is there a way with Google's API, or perhaps some other tool, to run an arbitrary list and retrieve Cached dates? I'd want to do this for diagnosis purposes and preferably in a way that OK with Google. I'd avoid CURLing for the cached URL and scraping out that dates with BASH, or any such kind of thing. Cheers,
Technical SEO | | RocketZando0