Files blocked in robots.txt and SEO
-
I use Joomla and I have blocked the following in my robots.txt. Is there anything here that is bad for SEO?
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
Disallow: /mailto:myemail@myemail.com/
Disallow: /javascript:void(0)
Disallow: /.pdf
-
What you have there is just blocking rootdomain.com/javascript:void(0). Googlebot can execute and index JavaScript; you should not block it without a good reason. I'd let it read the JavaScript and see the submenus.
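If you want to sanity-check that, Python's standard-library robots.txt parser is handy. A minimal sketch, assuming a stripped-down file containing only the rule in question and using example.com as a stand-in for your domain:

import urllib.robotparser

# Parse a hypothetical file containing only the rule being discussed.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /javascript:void(0)",
])

# The rule matches only that single literal path:
print(rp.can_fetch("Googlebot", "http://example.com/javascript:void(0)"))  # False
# A real script file elsewhere on the site is untouched by this rule:
print(rp.can_fetch("Googlebot", "http://example.com/menu.js"))  # True

So that line only blocks the one literal URL and has no effect on your actual JavaScript files.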
-
Thank you. And is blocking JavaScript bad? (I was thinking about the submenus.)
-
If you don't want pages in those disallowed directories to be indexed, then you're doing fine. Those pages can't be crawled, so they're unlikely to appear in search results on any search engine.
The last three entries look fishy to me. I'd need to know what types of URLs you're trying to block in order to fix them. For the last one, if you're looking to block all PDFs on your site, the syntax would be Disallow: /*.pdf.
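For reference, a sketch of how the tail of the file might look after cleanup, assuming the goal is simply to keep every PDF out of the index (the mailto: and javascript: entries aren't real paths on your server, so they can be deleted outright rather than disallowed):

User-agent: *
Disallow: /administrator/
# ... the other directory rules stay as they are ...
Disallow: /*.pdf

One caveat: the * wildcard is honoured by Googlebot and the other major crawlers, but it isn't part of the original robots.txt standard, so very old or obscure bots may ignore it.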
Related Questions
-
Scary bug in search console: All our pages reported as being blocked by robots.txt after https migration
We just migrated to https and two days ago created a new property in Search Console for the https domain. Our Webmaster Tools account for the https domain now shows, for every page in our sitemap, the warning: "Sitemap contains urls which are blocked by robots.txt." The Search Console dashboard also shows a red warning triangle saying our root domain is blocked by robots.txt.
1) When I test the URLs in the Search Console robots.txt testing tool, all looks fine.
2) When I fetch as Google and render the page, it renders and indexes without problem (it would not if it were really blocked by robots.txt).
3) We temporarily emptied the robots.txt completely, submitted it in Search Console, and uploaded the sitemap again: same warnings, even though no robots.txt was online.
4) We ran a Screaming Frog crawl on the whole website and it indicates that no page is blocked by robots.txt.
5) We carefully reviewed the whole robots.txt and it does not contain any row that blocks relevant content on our site or our root domain (the same robots.txt was online for the last decade in the http version without problems).
6) In Bing Webmaster Tools I could upload the sitemap and so far no errors are reported.
7) We resubmitted the sitemaps and got the same issue.
8) I already see our root domain with https in the Google SERPs.
The site is https://www.languagecourse.net. Since the site has significant traffic, if Google really does interpret our site as blocked by robots.txt for any reason, we will be in serious trouble. This is really scary, so even if it is just a bug in Search Console and does not affect crawling of the site, it would be great if someone from Google could look into the reason for it, since for a site owner this can really raise cortisol to unhealthy levels. Has anybody ever experienced the same problem? Does anybody have an idea where we could report this issue?
Intermediate & Advanced SEO | lcourse
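A quick independent check for a case like this - a minimal sketch using only the Python standard library; the URL comes from the question and the exact output depends on the live server - is to fetch robots.txt over both protocols and compare what is actually served, since a stray redirect or a stale file on one of them can confuse crawlers after a migration:

import urllib.request

for scheme in ("http", "https"):
    url = scheme + "://www.languagecourse.net/robots.txt"
    with urllib.request.urlopen(url) as resp:
        # geturl() exposes any redirect that was silently followed.
        print(url, "->", resp.geturl(), resp.status)
        print(resp.read().decode("utf-8", "replace")[:300])
-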
Ecommerce, SEO & Pagination
Hi, I'm trying to work out if there's something wrong with our pagination. We include rel="next" and rel="prev" on our pages. When clicking to page 2 on a product page, the URL shows as something like /lockers#productBeginIndex:30&orderBy:5&pageView:list&. However, if I search site:http://www.key.co.uk/en/key/lockers in Google, it seems to find paginated pages such as http://www.key.co.uk/en/key/lockers?page=2. I have a feeling something is going wrong here, but I haven't worked much on pagination before. Can anyone help?
Intermediate & Advanced SEO | BeckyKey
-
Disavow files on m.site
Hi, I have a site, www.example.com, and have finally got the developers to add Google Webmaster verification codes for example.com and m.example.com, as I was advised this is best practice. However, I was wondering: does this mean I now need to add the disavow file for the m. subdomain as well? Thanks, Andy
Intermediate & Advanced SEO | Andy-Halliday
-
How to make Google index your site? (Blocked with robots.txt for a long time)
The problem is that for a long time we had a website, m.imones.lt, that was blocked with robots.txt. Now we want Google to index it. We unblocked it a week or 8 days ago, but Google still does not recognize it: when I type site:m.imones.lt it says the site is still blocked with robots.txt. What should the process be to make Google crawl this mobile version faster? Thanks!
Intermediate & Advanced SEO | FCRMediaLietuva
-
I have two sitemaps which partly duplicate - one is blocked by robots.txt but I can't figure out why!
Hi, I've just found two sitemaps - one of them is .php and represents part of the site structure on the website. The second is a .txt file which lists every page on the website. The .txt file is blocked via robots exclusion protocol (which doesn't appear to be very logical as it's the only full sitemap). Any ideas why a developer might have done that?
Intermediate & Advanced SEO | McTaggart
-
Robots.txt error message in Google Webmaster from a later date than the page was cached, how is that?
I have error messages in Google Webmaster that state that Googlebot encountered errors while attempting to access the robots.txt. The last date that this was reported was on December 25, 2012 (Merry Christmas), but the last cache date was November 16, 2012 (http://webcache.googleusercontent.com/search?q=cache%3Awww.etundra.com/robots.txt&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a). How could I get this error if the page hasn't been cached since November 16, 2012?
Intermediate & Advanced SEO | eTundra
-
Local SEO for franchises
I have a client who franchises an ice cream shop. It started in Utah and there are several stores there. They are ranking well for local searches based in Utah. Now they have opened a store in Federal Way, WA. How can I get the new location to rank for local keywords on the same website?
Intermediate & Advanced SEO | fivestarfranchising
-
How Important is the IP Address for SEO?
Hi everyone, I am curious to know whether the IP address plays any role in SEO. What if a website is sharing an IP with a porn site, a BlueFart site, a fake Viagra pills site, etc.? Does that affect SEO? Please share your opinion on this. Thanks
Intermediate & Advanced SEO | seodoz