Removing CSS & JS Files from Index
-
Hi,
Google has indexed a few .CSS and .JS files that belong to our WordPress plugins and themes. I had them blocked via robots.txt, but realized this doesn't prevent indexation (and can likely hurt us since Google wants to access these files).
I've since removed the robots.txt rules and submitted a removal request via Search Console, but I want to make sure they don't come back.
Is there a way to put a noindex tag within .CSS and .JS files? Or should I do something with .htaccess instead?
-
I figured .htaccess would be the best route. Thank you for researching and confirming. I appreciate it.
-
Hi Tim,
Assigning a noindex tag to these files will not block them, only prevent them from showing in SERPs. This is the intended goal and the reason I deleted my robots.txt file which prevented crawling.
-
There's quite a big difference between crawling directives, which block access, and indexing directives. This article by (former?) Moz user S_ebastian_ is a good foundation read.
This article at developers.google.com is a good second read. If I'm understanding it right, Google thinks in terms of crawling directives vs indexing / serving directives.
My attempt at a TL;DR:
crawling = looking, using in any way :: controlled via robots.txt
indexing / serving = indexing, archiving, displaying snippets in results, etc :: controlled via html meta tags or web server htaccess (or similar for other web servers).
I'm not yet convinced that asking for noindex via htaccess causes the same sort of grief that a deny in robots.txt causes.
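To make the crawling vs indexing split concrete, here is a rough Python sketch (my own illustration, not anything Google publishes). It uses the standard library's urllib.robotparser for the robots.txt side and a plain header check for the X-Robots-Tag side. The function names, the example.com URL and the sample robots.txt are all made up for the example, and the header check ignores the user-agent-prefixed form of the header.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample data, for illustration only.
BLOCKING_ROBOTS = "User-agent: *\nDisallow: /wp-content/\n"
CSS_URL = "http://example.com/wp-content/themes/demo/style.css"


def crawl_allowed(robots_txt, user_agent, url):
    """Crawling directive: may this user agent fetch the URL at all?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noindex(headers):
    """Indexing directive: does the response carry X-Robots-Tag: noindex?"""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            directives = {part.strip().lower() for part in value.split(",")}
            # "none" is shorthand for "noindex, nofollow".
            if directives & {"noindex", "none"}:
                return True
    return False


def outcome(robots_txt, headers, user_agent, url):
    """Combine both checks the way a crawler would meet them: crawl first."""
    if not crawl_allowed(robots_txt, user_agent, url):
        return "not crawled: the noindex header is never seen"
    if has_noindex(headers):
        return "crawled, kept out of the index"
    return "crawled, eligible for indexing"


if __name__ == "__main__":
    noindex = {"X-Robots-Tag": "noindex"}
    print(outcome(BLOCKING_ROBOTS, noindex, "Googlebot", CSS_URL))
    print(outcome("", noindex, "Googlebot", CSS_URL))
    print(outcome("", {"Content-Type": "text/css"}, "Googlebot", CSS_URL))
```

The first case is the one that matters for this thread: a file that is disallowed in robots.txt never has its headers read, so a noindex sent with it cannot take effect.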
-
I would seriously think again when it comes to blocking/no-indexing your CSS and JS files - Google has in the past stated that if they cannot fully render your site properly then this could lead to poorer rankings.
You will also likely get error notifications in Search Console for this.
Check out this great article from July this year which goes into more details.
-
I haven't encountered undesirable .css or .js indexing myself (yet), but as you surmised, maybe this htaccess directive might be worth trying?
<FilesMatch "\.(txt|log|xml|css|js)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
Google seems to support it
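Before putting a pattern like that on a live server, it may be worth checking which file names it would catch. The sketch below uses Python's re module as a stand-in for Apache's regex engine, which should behave the same for a pattern this simple, but treat that as an assumption. It also assumes the dot is meant as a literal dot (written `\.`) and compares that with the unescaped form. The function name and the sample file names are made up.

```python
import re

# Dot escaped: matches a literal "." followed by one of the extensions.
ESCAPED = re.compile(r"\.(txt|log|xml|css|js)$")
# Dot unescaped: "." matches any character, so the match is looser.
UNESCAPED = re.compile(r".(txt|log|xml|css|js)$")


def gets_noindex(filename, pattern=ESCAPED):
    """True if the pattern would apply the noindex header to this file name."""
    return pattern.search(filename) is not None


if __name__ == "__main__":
    samples = ["style.css", "app.min.js", "sitemap.xml", "robots.txt",
               "index.html", "data.json", "nojs"]
    for name in samples:
        print(name, gets_noindex(name), gets_noindex(name, UNESCAPED))
```

Note that the extension list also catches robots.txt and sitemap.xml, and that the unescaped pattern matches a name like "nojs" that has no extension at all.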
-
Unless I'm severely misreading the links provided, which I've read before, it seems Google is stating that they read, render, and sometimes index .CSS and .JS files. Here's an article written a week after the second article you posted.
The aforementioned WordPress plugin and theme files hosted on my server are indeed showing up in Google SERPs.
I do not want to prevent Googlebot from reaching these files as they're needed for optimal site performance, but I do want them to be no-indexed. Thus, I don't want robots.txt to prevent crawling, only indexing.
Let me know if I'm misunderstanding.
-
TL;DR - You're worried about a problem that doesn't exist.
Googlebot doesn't index CSS or JS files. It indexes text files, HTML, PDF, DOC, XLS, etc., but it doesn't index style sheets or JavaScript files.
All you need in WordPress is to create a plain robots.txt file in the directory where WP is installed, with this content:
User-agent: *
Disallow:
Sitemap: http://site/sitemap-file-name.xml
And that's all. This has been explained many times:
http://googlewebmastercentral.blogspot.bg/2014/05/understanding-web-pages-better.html
http://googlewebmastercentral.blogspot.bg/2014/10/updating-our-technical-webmaster.html
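If you want to confirm that a robots.txt like the one above really blocks nothing, a quick check with Python's standard urllib.robotparser could look like this. It is only a sketch: the helper name and the test paths are made up, and Python's parser is not guaranteed to agree with Googlebot in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from this post, with its placeholder sitemap URL.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: http://site/sitemap-file-name.xml
"""


def is_crawlable(url, user_agent="Googlebot"):
    """True if the robots.txt above lets this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical paths on the placeholder host.
    for path in ["/", "/wp-content/plugins/demo/script.js",
                 "/wp-content/themes/demo/style.css"]:
        print(path, is_crawlable("http://site" + path))
```

An empty Disallow line means nothing is disallowed, so every path should come back as crawlable.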