Robots.txt blocked internal resources WordPress
-
Hi all,
We've recently migrated a WordPress website from staging to live, but the robots.txt was deleted in the process. I've created the following new one:
User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Allow: /wp-admin/admin-ajax.php
However, the site audit on SemRush now reports that a lot of pages have issues with internal resources blocked in the robots.txt file. These blocked internal resources are all cached and minified CSS elements: links, images and scripts.
Does this mean that Google won't crawl some parts of these pages with blocked resources correctly and thus won't be able to follow these links and index the images? In other words, is this any cause for concern regarding SEO?
Of course I can change the robots.txt again, but will URLs like https://example.com/wp-content/cache/minify/df983.js end up in the index?
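To see which URLs the file above actually blocks, here is a minimal Python sketch. It assumes the longest-match precedence described in RFC 9309 and in Google's robots.txt documentation (the most specific rule wins, and Allow wins a tie), and it only handles plain path prefixes, not the * or $ wildcards. The function name is_allowed and the /blog/ URL are made up for illustration.

```python
from urllib.parse import urlparse

# Rules copied from the "User-agent: *" group of the robots.txt above.
RULES = [
    ("allow", "/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/cache/"),
    ("disallow", "/wp-content/themes/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def is_allowed(url, rules=RULES):
    """Return True if the longest matching rule for the URL path is an Allow."""
    path = urlparse(url).path or "/"
    best_len, allowed = -1, True  # no matching rule means the URL is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        # A longer prefix wins; on a tie, Allow wins over Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and kind == "allow"):
            best_len, allowed = len(prefix), kind == "allow"
    return allowed


print(is_allowed("https://example.com/wp-content/cache/minify/df983.js"))  # False
print(is_allowed("https://example.com/wp-admin/admin-ajax.php"))           # True
print(is_allowed("https://example.com/blog/"))                             # True
```

So under that reading, the minified file is blocked from crawling, which matches what the audit reports. Note that Python's built-in urllib.robotparser applies rules in file order rather than by longest match, so it would treat the leading Allow: / differently.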
Thanks for your thoughts!
-
Thanks for the answer!
Last question: is /wp-admin/admin-ajax.php an important part that has to be crawled? I found this explanation: https://wordpress.stackexchange.com/questions/190993/why-use-admin-ajax-php-and-how-does-it-work/191073#191073
However, on this specific website that URL returns no HTML at all when I check the source code, only a single line containing 0.
-
I would leave all the disallows out except for the /wp-admin/ section. For example, I'd rewrite the robots.txt file to read:
User-agent: *
Disallow: /wp-admin/
Also, you kind of want Google to index your cached content. In the event your servers go down it will still be able to make your content available.
I hope that helps. Let me know how that works out for you!
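If you want to sanity-check the shorter file before uploading it, Python's standard urllib.robotparser module can do it. This is only a sketch: the can_crawl helper is a made-up name, and the parser is fed the two lines directly instead of fetching a live robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The simplified file suggested above.
SIMPLIFIED = """\
User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(SIMPLIFIED.splitlines())


def can_crawl(url, agent="Googlebot"):
    """Return True if the parsed rules let the given user agent fetch the URL."""
    return parser.can_fetch(agent, url)


print(can_crawl("https://example.com/wp-content/cache/minify/df983.js"))  # True
print(can_crawl("https://example.com/wp-admin/admin-ajax.php"))           # False
```

The cached and minified files become crawlable again. Keep in mind that admin-ajax.php sits under /wp-admin/, so with this file it is blocked too unless an Allow line for it is kept.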
-
Thanks for the clear answer.
I've changed the robots.txt to:
User-agent: *
Allow: /
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/themes/
Allow: /wp-admin/admin-ajax.php
This should avoid problems with not indexing (parts of) cached content.
Or should I leave all the Disallows out?
-
Hey there --
Blocking resources with the robots.txt file prevents search engines from crawling content. The noindex tag would be better suited for preventing content from being indexed.
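For reference, a noindex signal can arrive either as a robots meta tag in the HTML or as an X-Robots-Tag response header (the header is the only option for non-HTML files such as .js). Here is a rough Python sketch for checking a response you have already fetched. The has_noindex name is hypothetical, and the sketch ignores user-agent-specific forms such as "googlebot: noindex" and bot-specific meta names.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(p.strip().lower() for p in content.split(","))


def has_noindex(headers, html=""):
    """Return True if the header or the robots meta tag carries noindex."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = {p.strip().lower() for p in header_value.split(",")}

    meta = _RobotsMetaParser()
    meta.feed(html)
    return "noindex" in header_directives or "noindex" in meta.directives


print(has_noindex({"X-Robots-Tag": "noindex"}))  # True
print(has_noindex({}, '<meta name="robots" content="noindex, follow">'))  # True
print(has_noindex({"Content-Type": "application/javascript"}))  # False
```

A crawler can only see either signal if it is allowed to fetch the URL, so a noindex on a URL that robots.txt blocks will not be read.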
Previous best practice was to block access to the /wp-includes/ and /wp-content/ directories, etc., but that's no longer necessary.
Today, Google will fetch all your styling and JavaScript files so they can render your pages completely. Search engines now try to understand your page's layout and presentation as a key part of how they evaluate quality.
So, yeah, blocking those files might have some impact on your SEO.
Also, if you're using a plugin to cache content, you should want Google to crawl your cached content. And in my experience, Googlebot does a good job of not indexing /wp-content/ sections.
So your example URL, https://example.com/wp-content/cache/minify/df983.js, shouldn't end up in their index.
Hope this helps some.