Need Help With Robots.txt on Magento eCommerce Site
-
Hello, I am having difficulty getting my robots.txt file to be configured properly. I am getting error emails from Google products stating they can't view our products because they are being blocked, and this past week, in my SEO dashboard, the URL's receiving search traffic dropped by almost 40%.
Is there anyone that can offer assistance on a good template robots.txt file I can use for a Magento eCommerce website?
The one I am currently using was found at this site here: e-commercewebdesign.co.uk/blog/magento-seo/magento-robots-txt-seo.php - However, I am getting problems from Google now because of it.
I searched and found this thread here: http://www.magentocommerce.com/wiki/multi-store_set_up/multiple_website_setup_with_different_document_roots#the_root_folder_robots.txt_file - But I felt like maybe I should get some additional help on properly configuring a robots for a Magento site.
Thanks in advance for any help. Please, let me know if you need more info to provide assistance.
-
You better back up your DB before doing that. Anyway, take a look at this MagentoConnect extension http://www.magentocommerce.com/magento-connect/MageWorx.com/extension/2852/seo-suite-enterprise#overview
or this one (it's by the same company
http://www.mageworx.com/seo-suite-pro-magento-extension.html
-
Thank you very much. We'll give that a shot and see how it goes. What started us tinkering with the robots file in the first place is that Bing Shopping told us it couldn't crawl our product images. Plus, our pdf files for product specs and manuals are all listed within the media folder. Do you have a suggestion for this? I would think we would get rid of "Disallow: /media/" and replace it with the following (what do you think?):
Disallow: /media/aitmanufacturers/
Disallow: /media/bigtom_media/
Disallow: /media/css/
Disallow: /media/downloadable/
Disallow: /media/easybanner/
Disallow: /media/geoip/
Disallow: /media/icons/
Disallow: /media/import/
Disallow: /media/js/
Disallow: /media/productsfeed/
Disallow: /media/sales/
Disallow: /media/tmp/
Disallow: /media/UPS/ -
Hello,
Below is what I use. You need to have the modrewrite enabled if you are going to disallow index.php and even then it's still very risky. This may be part of the issue. Robots.txt is so important, but you need to know what you are doing. Especially when disallowing as much as that UK site is.
Tyler
User-agent: *
Disallow: /*?
Disallow: /*.js$
Disallow: /*.css$
Disallow: /checkout/
Disallow: /catalogsearch/
Disallow: /review/
Disallow: /app/
Disallow: /downloader/
Disallow: /images/
Disallow: /js/
Disallow: /lib/
Disallow: /media/
Disallow: /*.php$
Disallow: /pkginfo/
Disallow: /report/
Disallow: /skin/
Disallow: /var/
Disallow: /customer/
Disallow: /enable-cookies/
Sitemap: http://domain.com/sitemap.xml
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Using one robots.txt for two websites
I have two websites that are hosted in the same CMS. Rather than having two separate robots.txt files (one for each domain), my web agency has created one which lists the sitemaps for both websites, like this: User-agent: * Disallow: Sitemap: https://www.siteA.org/sitemap Sitemap: https://www.siteB.com/sitemap Is this ok? I thought you needed one robots.txt per website which provides the URL for the sitemap. Will having both sitemap URLs listed in one robots.txt confuse the search engines?
Technical SEO | | ciehmoz0 -
Need Advice on Categorizing Posts, Using Topics, Site Navigation & Structure
Hey there, My site had terrible categorization. I did a redesign, and essentially decided to start over using Topics instead of categories - which appear as my site's main navigation. Now I need to assign a Topic to all my posts. Is it safe to assign posts to multiple parent Topics from an SEO point of view? I want to do it since it would be helpful for users to find them in multiple locations some of the time, but I certainly don't want any SEO issues. Also, should I de-categorize all of my posts since I'm assigning them to my new hierarchical taxonomy - Topics? This is very important to finalize. Any help or advice is greatly appreciated. Thanks, Mike
Technical SEO | | naturalsociety0 -
Site Not Being Indexed
Hey Everyone - I have a site that is being treated strangely by google (at least strange to me) The site has 24 pages in the sitemap - submitted to WMT'S over 30 days ago I've manually triggered google to crawl the homepage and all connecting links as well and submitted a couple individually. Google has been parked the indexing at 14 of the 24 pages. None of the unindexed URL's have Noindex or follow tags on them - they are clearly and easily linked to from other places on the site. The site is a brand new domain, has no manual penalty history and in my research has no reason to be considered spammy. 100% unique handwritten content I cannot figure out why google isn't indexing these pages. Has anyone encountered this before? Know any solutions? Thanks in advance.
Technical SEO | | CRO_first0 -
Site hacked in Jan. Redeveloped new site. Still not ranking. Should we change domain?
Our top ranking site in the UK was hacked at the end of 2014. http://www.ultimatefloorsanding.co.uk/ The site was the subject of a manual spam action from Google. After several unsuccessful attempts to clean it up, using Securi.net and reinstating old versions of the site, changing passwords etc. we took the decision to redevelop the site. We also changed hosting provider as we had received absolutely no support from them whatsoever in resolving the issue. So far we have: Removed the old website files off the server Developed a new website having implemented 301's for all the old URL's (except the spam ones) Submitted a reconsideration request for the manual spam action, which was accepted. Disavowed all the spammy inbound links through Webmaster Tools Implemented custom URL parameters through Google to not index the SPAM URLs ( which were using parameters) Our organic traffic is down by 63% compared to last year, and we are not ranking for most of our target keywords any longer. Is there anything that I am missing in the actions I have taken so far? We were advised that at this stage changing domain and starting again might be the way to go. However the current domain has been used by us since 2007, so it would be a big call. Any advice is appreciated, thanks. Sue - http://www.ultimatefloorsanding.co.uk/
Technical SEO | | galwaygirl0 -
Small business sites banned by google. Please help.
Hi. My 2 sites http://www.painterdublin.com and http://www.tilers-dublin.com were banned by google update in November 2012. Both were ranking fairly well in search results: painterdublin generating cca 600/month and tilers-dublin cca 300/month organic traffic from google. After update it is about 70% less. Is there anyone willing to take a look at my pages and give me some advice about what to do to improve the situation? thank you very much
Technical SEO | | jarik0 -
Help with onpage keyword optimization, site architecture, and how those aspects affect the SERPs.
Hey guys, I've made a post or two before, but my story is that I've been learning SEO for a while now and have only recently (in the last four months) had the opportunity to actually apply what I've been reading about. What I've learned while trying to put these things into practice is that it can be pretty tough sledding, even when it comes to basic elements like keywords and search results. Anyway, to the good stuff. I've been helping my brother's startup company in my spare time because I want them to do well. They're on the last legs of their series A funding and have no money to put towards SEO, content marketing or social, so I'm helping when and where I can for free. The company is Maluuba, a siri-like personal assistant app for Android with a ton of different domains. They launched at TechCrunch Disrupt and actually have a lot of traction and a fair amount of publicity, so I'm not exactly working with scraps, but I don't work with them in their offices and only really communicate with my brother, who is having a really hard time getting buy-in for some of the stuff I want them to do. Their initial website was pretty terrible, so my brother got the okay to redesign the site and together, we worked with a designer to implement the site I linked to. Because they have so many domains (search, social, organization) I thought creating specific pages along with a one homepage would be a good way to optimize for different things and funnel a wider audience to convert to the one macro goal of the site: getting people to download the app. The results haven't been exactly what I expected and I fear I didn't really implement what I still think is a good plan correctly. I've only tried to optimize the pages for a few keywords to start. The main keyword for the homepage and indeed the brand is 'personal assistant app' which is a fairly competitive keyword that I know have them ranking second for on Google CA. I used 'siri-alternative' as a secondary keyword, since that's how they label themselves in the Play Store. For the three other main (pages search, social, organization) I used 'personal assistant app' as a secondary keyword and tried to optimize each page for 'search app', 'social app' and 'organizer app', respectively. While I'm really quite proud that I managed to get a page ranking in the top three for our main keyword, I'm just as disappointed that it's the search page and not the homepage, mainly because I have no idea why it's happening. So, all of that to ask a few questions: Did I make a mistake by trying to add funnels to the site? Or did I just go about optimizing the pages incorrectly? Why does the search page rank really, really well for 'personal assistant app' while the other pages - including the one I intended to rank the highest for that term - lag behind? I'd guess that Google is indexing this page alone as the main representative of 'personal assistant app', but that wasn't my intention. I'm also not using any rel=canonical tags, if that matters. Also, this page has been flipping around in the 1-3 range in the SERPs for about a month, but I still haven't noticed any traffic from 'personal assistant app'. Alright, this is getting way to long. I'd very much appreciate any and all insights as to what I'm doing wrong or what I'm missing. It could be really obvious and thus make this post silly, but I really have read and tried to learn a lot. I just can't see what's going on here because I don't have any experience to compare it to. Thanks in advance for any help. Cheers, JD
Technical SEO | | JDMcNamara1 -
Has anyone used a company to help promote their site
Hi, i receive around ten emails a day claiming they can help you get your site in the top ten in google, now i know most are a load of rubbish but i am just wondering if anyone has used any of these companies for a new site or an old site. I am about to launch a new site after xmas and i am just wondering if any of these companies are worth looking at to help promote the new site instead of doing all the ground work myself. Would love to know your thoughts
Technical SEO | | ClaireH-1848860 -
Robots.txt
My campaign hse24 (www.hse24.de) is not being crawled any more ... Do you think this can be a problem of the robots.txt? I always thought that Google and friends are interpretating the file correct, seen that he site was crawled since last week. Thanks a lot Bernd NB: Here is the robots.txt: User-Agent: * Disallow: / User-agent: Googlebot User-agent: Googlebot-Image User-agent: Googlebot-Mobile User-agent: MSNBot User-agent: Slurp User-agent: yahoo-mmcrawler User-agent: psbot Disallow: /is-bin/ Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_Storefront-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_DisplayProductInformation-Start Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/ Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/Root%20Edition/units/HSE24/Beratung/
Technical SEO | | remino630