Robots.txt Disallow: / in Search Console
-
Two days ago I found out through search console that my website's Robots.txt has changed to
User-agent: *
Disallow: /When I check the robots.txt in the website it looks fine - I see its blocked just in search console( in the robots.txt tester).
when I try to do fetch as google to the homepage I see its blocked. Any ideas why would robots.txt block my website? it was fine until the weekend.
- before that, in the last 3 months I saw I had blocked resources in the website and I brought back pages with fetch as google.
Any ideas?
-
Hello Ran,
Just to clarify, in search console, when you go to Crawl-> Robots.txt Tester and in the middle right clicking en el see live robots.txt, it doesnt show the correct file?
It could be that google isnt recrawling the new robots.txt
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Search Console Errors 400 and 405
Hi, Does anyone know if search console errors showing as follows are damaging to serps: /xmlrpc.php is returning 405 error /wp-admin/admin-ajax.php is returning 400 error These errors seem to of coincided almost to the day that there was a ranking drop for the primary keyword from mid page 1 to bottom of page 2. No matter what I do I cannot seem to correct these errors. Any advice would be greatly appreciated. Thanks
Technical SEO | | DaleZon0 -
Home Page Being Indexed / Referral URLs /
I have a few questions related to home page URLs being indexed, canonicalization, and GA reporting... 1. I can view the home page by typing in domain.com , domain.com/ and domain.com/index.htm There are no redirects and it's canonicalized to point to domain.com/index.htm -- how important is it to have redirects? I don't want unnecessary redirects or canonical tags, but I noticed the trailing slash can sometimes be typed in manually on other pages, sometimes not. 2. When I do a site search (site:domain.com), sometimes the HP shows up as "domain.com/", never "domain.com/index.htm" or "domain.com", and sometimes the HP doesn't show up period. This seems to change several times a day, sometimes within 15 minutes. I have no idea what is causing it and I don't know if it has anything to do with #1. In a perfect world, I would ask for the /index.htm to be dropped and redirected to .com/, and the canonical to point to .com/ 3. I've noticed in GA I see / , /index.htm, and a weird Google referral URL (/index.htm?referrer=https://www.google.com/) all showing up as top pages. I think the / and /index.htm is because I haven't setup a default URL in GA, but I'm not sure what would cause the referrer. I tracked back when the referrer URL started to show up in the top pages, and it was right around the time they moved over to https://, so I'm not sure what the best option is to remove that. I know this is a lot - I appreciate any insight anyone can provide.
Technical SEO | | DigMS0 -
SEO advice on ecommerce url structure where categories contain "/c/"
Hi! We use Hybris as plattform and I would like input on which url to choose. We must keep "/c/" before the actual category. c stands for category. I.e. this current url format will be shortened and cleaned:
Technical SEO | | hampgunn
https://www.granngarden.se/Sortiment/Husdjur/Hund/Hundfoder-%26-Hundmat/c/hundfoder To either: a.
https://www.granngarden.se/husdjur/hund/hundfoder/c/hundfoder b.
https://www.granngarden.se/husdjur/hund/c/hundfoder (hundfoder means dogfood) The question is whether we should keep the duplicated category name (hundfoder) before the "/c/" or not. Will there be SEO disadvantages by removing the duplicate "hundfoder" before the "/c/"? I prefer the shorter version ofc, but do not want to jeopardize any SEO rankings or send confusing signals to search engines or customers due to the "/c/" breaking up the url breadcrumb. What do you guys say and prefer from the above alternatives? Thanks /Hampus0 -
Image Search / sudden drop in traffic
One of our sites in Germany had a very sudden drop in traffic (starting Oct. 7th). The site gets most of it's organic traffic from Image Search. Checking in Search Console revealed that search volume for keywords increased in that period our average position is stable our click rate dropped dramatically (we double checked - searching the keyword in "anonymous mode" still showed our results for main keywords in top image positions (first 2 rows)). As an example (see attached screencopy) - keyword had clickrate of 1% (average) - dan dropped to 0.06% while the position remained stable. Germany is still using the "old" version of image search (unlike the rest of the world) - which gives the site preview rather than just the image slider when you click on a result in image search. Our first thought that this was changed - but it seems that it didn't change. Ideas what might cause this dramatic drop in click%? There have been no major technical modifications on the site for the last 2 months. thanks, Dirk GjlV8CW.jpg
Technical SEO | | DirkC0 -
Robots.txt blocking Addon Domains
I have this site as my primary domain: http://www.libertyresourcedirectory.com/ I don't want to give spiders access to the site at all so I tried to do a simple Disallow: / in the robots.txt. As a test I tried to crawl it with Screaming Frog afterwards and it didn't do anything. (Excellent.) However, there's a problem. In GWT, I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain, changed it for ALL my addon domains. (Ex. http://ethanglover.biz/ ) From a directory point of view, this makes sense, from a spider point of view, it doesn't. As a solution, I changed the robots.txt file back and added a robots meta tag to the primary domain. (noindex, nofollow). But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority. How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.) Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks. Proof I'm not crazy in attached video. robotstxt_addon_domain.mp4
Technical SEO | | eglove0 -
What's wrong with this robots.txt
Hi. really struggling with the robots.txt file
Technical SEO | | Leonie-Kramer
this is it: User-agent: *
Disallow: /product/ #old sitemap
Disallow: /media/name.xml When testing in w3c.org everything looks good, testing is okay, but when uploading it to the server, Google webmaster tools gives 3 errors. Checked it with my collegue we both don't know what's wrong. Can someone take a look at this and give me the solution.
Thanx in advance! Leonie1 -
Site wide search v catalogue search
I have a client building a new web site who has agreed that a site search function is a good thing in order to get a view on how customers are using the site, the search terms they are using as a source of keywords etc. The problem is the developer has implemented a catalogue/product search which only queries the products in the database. On the one hand this is fine in that the search is directing users to products and not to other areas of the site. But the customer is disappointed that the search is not site wide. Are there any solutions where third party search utility could be implemented whithin the site which will search both? The ecommerce platform is Magento. Any views would be very helpful!
Technical SEO | | k3nn3dy30 -
Robots.txt questions...
All, My site is rather complicated, but I will try to break down my question as simply as possible. I have a robots.txt document in the root level of my site to disallow robot access to /_system/, my CMS. This looks like this: # /robots.txt file for http://webcrawler.com/
Technical SEO | | Horizon
# mail webmaster@webcrawler.com for constructive criticism **User-agent: ***
Disallow: /_system/ I have another robots.txt file in another level down, which is my holiday database - www.mysite.com/holiday-database/ - this is to disallow access to /holiday-database/ControlPanel/, my database CMS. This looks like this: **User-agent: ***
Disallow: /ControlPanel/ Am I correct in thinking that this file must also be in the root level, and not in the /holiday-database/ level? If so, should my new robots.txt file look like this: # /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism **User-agent: ***
Disallow: /_system/
Disallow: /holiday-database/ControlPanel/ Or, like this: # /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism **User-agent: ***
Disallow: /_system/
Disallow: /ControlPanel/ Thanks in advance. Matt0