Robots.txt - blocking JavaScript and CSS, best practice for Magento
-
Hi Mozzers,
I'm looking for some feedback on best practices for setting up the robots.txt file in Magento.
I'm concerned we are blocking bots from crawling information that is essential for ranking.
My main concern is with blocking JavaScript and CSS: are you supposed to block them or not?
You can view our robots.txt file here
Thanks,
Blake
-
As Joost said, you should not block access to files that help in the reading / rendering of the page.
Looking at your Robots file, I would look at the following two exclusions. Do they block anything else that runs on a live page that Google should be seeing?
Disallow: /includes/
Disallow: /scripts/

-Andy
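If those folders do contain assets that live pages load, one option is to keep them blocked while re-opening the renderable files, since Googlebot honours Allow directives and wildcards with longest-match precedence. This is a sketch only; the Allow paths are illustrative, not taken from the actual file:

```
User-agent: *
Disallow: /includes/
Disallow: /scripts/
# The longest matching rule wins for Googlebot, so these re-open
# CSS/JS even though their parent folders stay disallowed:
Allow: /includes/*.css
Allow: /scripts/*.js
```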
-
Best practice is not to block access to JS / CSS anymore; allowing it lets Google properly render the website and determine mobile-friendliness.
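One quick way to sanity-check a rule set before deploying it is Python's standard-library `urllib.robotparser`. The rules and URLs below are hypothetical stand-ins for the file in question:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules mirroring a Magento robots.txt that still
# blocks the /includes/ and /scripts/ directories.
rules = """\
User-agent: *
Disallow: /includes/
Disallow: /scripts/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# JS served from a blocked folder is invisible to Googlebot...
print(parser.can_fetch("Googlebot", "https://example.com/scripts/menu.js"))   # False
# ...while assets outside those folders remain crawlable.
print(parser.can_fetch("Googlebot", "https://example.com/media/styles.css"))  # True
```

The same check works against a live site by calling `set_url(...)` and `read()` instead of `parse(...)`.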
Related Questions
-
URL Structure & Best Practice when Facing 4+ Sub-levels
Hi. I've spent the last day fiddling with the setup of a new URL structure for a site, and I can't "pull the trigger" on it.

Example: domain.com/games/type-of-game/provider-name/name-of-game/
Specific example: arcade.com/games/pinball/deckerballs/starshooter2k/

The example is a good description of the content that I have to organize. The aim is to a) define the URL structure, b) facilitate good UX, c) create a good starting point for content marketing and SEO while avoiding stuffing multiple keywords in URLs.

The problem? Not all providers have the same type of game. Meaning that once I get past /type-of-game/, I must write a new category / page / content for /provider-name/. No matter how I switch the different "sub-levels" around in the URL, at some point the provider-name doesn't fit, as it needs new content, multiple times.

The solution? I can skip "provider-name". The caveat, though, is that I lose out on ranking for provider keywords, as I don't have a cornerstone content page for them.

Question: Using the URL structure as outlined above in WordPress, would you A) go with "Pages", or B) use "Posts"?
Intermediate & Advanced SEO | Dan-Louis
-
SEO Best Practices regarding Robots.txt disallow
I cannot find hard-and-fast direction about the following issue: the robots.txt file on my server has been set up to disallow "account" and "search" pages within my site, so I am receiving warnings from the Google Search Console that URLs are being blocked by robots.txt (Disallow: /Account/ and Disallow: /?search=). Do you recommend unblocking these URLs? I'm getting a warning that over 18,000 URLs are blocked by robots.txt ("Sitemap contains urls which are blocked by robots.txt"). It seems that I wouldn't want that many URLs blocked? Thank you!!
Intermediate & Advanced SEO | jamiegriz
-
H1 tags and keywords for subpages, is it best practice to reuse the keywords?
So let's say I have a parent page for shoes, and I have subpages for dress shoes, work shoes, and play shoes; then inside each of those pages I have dress shoe cleaning and dress shoe repair, and the same for work and play shoes. Would it be ok to use h1 tags like this:

Shoes >
  Dress Shoes > Dress Shoe Cleaning, Dress Shoe Repair
  Work Shoes > Work Shoe Cleaning, Work Shoe Repair
  Play Shoes > Play Shoe Cleaning, Play Shoe Repair

Would these be considered duplicate h1 tags since "cleaning" and "repair" are used for each subpage? In certain niche companies, it's rather difficult to use synonyms for keywords. Or is it ok to just keep things simple and use Shoes > Dress Shoes > Cleaning and so on? Especially since we have URLs and breadcrumbs that are structured nicely using keywords; for this example both breadcrumbs and URLs read like sitename.com/shoes/dress-shoes/cleaning. Any advice?
Intermediate & Advanced SEO | Deacyde
-
Practical steps to increase Domain Authority
Having read Neil Patel's guide to DA, I am still at a loss as to practical steps I can take to help improve my Domain Authority. This is a summary of my findings / action plan so far:

1. Building quality incoming links by producing excellent content that people will love!
2. Having "related articles" to keep users on site for longer, and to provide more information.

I am certain I am missing some more steps? More worryingly, my DA has gone down from 8 to 5, and I do not know how to improve it. Please help with real practical steps that I can use.
Intermediate & Advanced SEO | propertysaviour
-
Dilemma about "images" folder in robots.txt
Hi, hope you're doing well. I am sure you guys must be aware that Google has updated their webmaster technical guidelines, saying that users should allow access to their CSS and JavaScript files if possible. It used to be that Google would render web pages only text-based; now it claims that it can read the CSS and JavaScript. According to their own terms, not allowing access to the CSS files can result in sub-optimal rankings: "Disallowing crawling of Javascript or CSS files in your site's robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings." http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.html

We have allowed access to our CSS files, and Googlebot is seeing our webpages more like a normal user would (tested it in GWT). Anyhow, this is my dilemma, and I am sure a lot of other users might be facing the same situation, like any other e-commerce companies/websites: we have a lot of images. It used to be that our CSS files were inside our images folder, so I have allowed access to that. Here's the robots.txt: http://www.modbargains.com/robots.txt

Right now we are blocking the images folder, as it is very huge, very heavy, and some of the images are very high-res. The reason we are blocking it is that we feel Googlebot might spend almost all of its time trying to crawl that "images" folder only, and might not have enough time to crawl other important pages. Not to mention a very heavy server load, on Google's end and on ours. We do have good, high-quality, original pictures, and we feel that we are losing potential rankings since we are blocking images.

I was thinking to allow ONLY the Google image bot access to it, but I still feel that Google might spend a lot of time doing that. I was wondering whether Google makes a decision like "let me spend 10 minutes for the Google image bot, and 20 minutes for the Google mobile bot", or whether it has separate "time spending" allocations for all of its bot types. I want to unblock the images folder, for now only for the Google image bot, but at the same time I fear that it might drastically hamper indexing of our important pages, as I mentioned before, because of having tons and tons of images and Google spending enough time already just to crawl that folder.

Any advice? Recommendations? Suggestions? Technical guidance? Plan of action? Pretty sure I answered my own question, but I need confirmation from an expert that I am right in saying: allow only Google Image access to my images folder.

Sincerely,
Shaleen Shah
Intermediate & Advanced SEO | Modbargains
-
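The per-bot split described in the question above can be expressed directly in robots.txt, since a crawler follows the most specific user-agent record that matches it and ignores the rest. A minimal sketch, assuming (hypothetically) that the images live under an /images/ folder:

```
# Googlebot-Image matches this record and ignores the * record below.
User-agent: Googlebot-Image
Allow: /images/

# All other crawlers keep the folder blocked.
User-agent: *
Disallow: /images/
```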
Massive URL blockage by robots.txt
Hello people, in May there was a dramatic increase in URLs blocked by robots.txt, even though we don't have that many URLs or crawl errors. You can view the attachment to see how it went up. The thing is, the company hasn't touched the text file since 2012. What might be causing the problem? Can this result in any penalties? Can indexation be lowered because of this?
Intermediate & Advanced SEO | moneywise_test
-
What is the best way to link between all my portals?
Hi, I own 12 different portals within gambling; they more or less work and feel like this one, Casinotopplisten. What is the best way for me to link between all of them? Since there is a lot going on in Google these days, I haven't linked between the sites at all, but I feel that to be somewhat of a waste. So here are my ideas so far, in ranked order:

1. Add a menu at the top right of the site, or in the footer, that links to the 10 different sites in different languages. The text link should then only be "Norwegian, Swedish, English etc.".
2. Basically the same as above, but in addition linking to the "same page" in the other languages. As all pages have the same article set for starters, this can be done.
3. Don't do any linking between the sites, and only link to the sites separately from our company blog/site.
4. Don't link at all.

I should add that all of these sites are on different IPs with different domains and all in different languages. Hope someone can add their 2c on this one. Thanks!
Intermediate & Advanced SEO | MortenBratli
-
Should we block urls like this - domainname/shop/leather-chairs.html?brand=244&cat=16&dir=asc&order=price&price=1 - within the robots.txt?
I've recently added a campaign within the SEOmoz interface and received an alarming number of errors (~9,000) on our eCommerce website. This site was built in Magento, and we are using search-friendly URLs; however, most of our errors were duplicate content / titles due to URLs like: domainname/shop/leather-chairs.html?brand=244&cat=16&dir=asc&order=price&price=1 and domainname/shop/leather-chairs.html?brand=244&cat=16&dir=asc&order=price&price=4. Is this hurting us in the search engines? Is rogerbot too good? What can we do to cut off bots after the ".html?"? Any help would be much appreciated 🙂
Intermediate & Advanced SEO | MonsterWeb28
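To "cut off bots after the .html?", the major crawlers (including Googlebot, and to my knowledge Moz's rogerbot) support wildcard patterns in robots.txt. A sketch only, worth testing against your own URL list first, since the pattern below blocks every parameterised URL under /shop/, including any you might want crawled:

```
User-agent: *
# Matches any /shop/ URL whose path contains .html followed by a query string.
Disallow: /shop/*.html?
```

A rel="canonical" tag on the filtered pages pointing at the unfiltered page is the usual complementary fix for the duplicate-title warnings.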