Scary bug in Search Console: all our pages reported as blocked by robots.txt after HTTPS migration
-
We just migrated to HTTPS and, two days ago, created a new property in Search Console for the HTTPS domain.
Webmaster Tools for the HTTPS domain now shows the following warning for every page in our sitemap: "Sitemap contains urls which are blocked by robots.txt." The Search Console dashboard also shows a red warning triangle saying our root domain is blocked by robots.txt. Here is what we have checked so far:

1) When I test the URLs in the Search Console robots.txt testing tool, everything looks fine.
2) When I Fetch as Google and render the page, it renders and indexes without problems (it would not if it were really blocked by robots.txt).
3) We temporarily emptied the robots.txt completely, submitted it in Search Console, and uploaded the sitemap again; we got the same warnings even though no robots.txt rules were online.
4) We ran a Screaming Frog crawl on the whole website, and it indicates that no page is blocked by robots.txt.
5) We carefully reviewed the whole robots.txt, and it does not contain any line that blocks relevant content on our site or our root domain. (The same robots.txt was online for the last decade on the HTTP version without problems.)
6) In Bing Webmaster Tools I could upload the sitemap, and so far no error has been reported.
7) We resubmitted the sitemaps and saw the same issue.
8) I already see our root domain with HTTPS in the Google SERPs.

The site is https://www.languagecourse.net. Since the site has significant traffic, if Google really does interpret our site as blocked by robots.txt for any reason, we will be in serious trouble.
This is really scary. Even if it is just a bug in Search Console and does not affect crawling of the site, it would be great if someone from Google could look into the reason for it, since for a site owner this can really raise cortisol to unhealthy levels.

Has anybody ever experienced the same problem? Does anybody have an idea where we could report this issue?
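For anyone who wants to reproduce the check outside of Search Console, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt URL is the site mentioned in the question; the sample page path is a hypothetical placeholder, so substitute real URLs from the sitemap:

```python
# Minimal sketch: ask the live robots.txt whether Googlebot may fetch a URL,
# independently of what Search Console reports. Standard library only.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.set_url("https://www.languagecourse.net/robots.txt")
parser.read()  # fetches and parses the live robots.txt

for url in [
    "https://www.languagecourse.net/",           # the root domain
    "https://www.languagecourse.net/some-page",  # placeholder; use sitemap URLs
]:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{url} -> {'allowed' if allowed else 'BLOCKED'}")
```

If this reports every sitemap URL as allowed while the new property still shows the warning, that points toward stale data in the freshly created property rather than a real block.
-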
Hi icourse, thanks for your question! You've received some thoughtful responses. Did any of them help you sort your issue out? If so, please mark one or more as a "Good Answer." Thanks!
Christy
-
I'd still speak with the hosting provider. It may be a firewall setting, not the CDN.
-
Hi Donna, we are using Cloudflare, which may have blocked your scan/bot.
We temporarily disabled Cloudflare, submitted a new robots.txt and a new sitemap, and we still get the same warnings.
-
I recently updated a WordPress website from noindex to wanting it indexed, and I still had a warning in Search Console for a day or two; the homepage was initially indexed with the meta description saying "A description for this website ...", even though the actual Fetch and Render and the robots.txt test were both fine. If you are absolutely sure there isn't anything wrong, I would give it a couple of days. In our case, I resubmitted the homepage to Google to speed up the process, and that fixed it.
-
Have you checked with your website hosting provider? They may be blocking bots at a server level. I know when I tried to scan your site I got a connection timeout error.
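One way to test that theory yourself is to request robots.txt once with a browser-like User-Agent and once with a Googlebot-like one, and compare the outcomes; a firewall or CDN that filters bots may time out or return an error only for the second request. A rough sketch, standard library only (note this is only indicative, since such filters often also check the requesting IP):

```python
# Rough sketch: compare how the server answers a browser-like client versus
# a Googlebot-like client. Differences suggest bot filtering at the
# server/CDN level rather than a robots.txt problem.
import urllib.error
import urllib.request

URL = "https://www.languagecourse.net/robots.txt"
AGENTS = {
    "browser-like":   "Mozilla/5.0",
    "googlebot-like": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                      "+http://www.google.com/bot.html)",
}

for label, ua in AGENTS.items():
    req = urllib.request.Request(URL, headers={"User-Agent": ua})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            print(f"{label}: HTTP {resp.status}, {len(resp.read())} bytes")
    except (urllib.error.URLError, TimeoutError) as exc:
        print(f"{label}: request failed ({exc})")
```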
Related Questions
-
301 Old domain with HTTPS to new domain with HTTPS
I am a bit boggled about HTTPS-to-HTTPS redirects. We redirected olddomain.com to https://www.newdomain.com, but redirecting https://www.olddomain.com (or the non-www version) is not possible, because the certificate does not exist at the level you are redirecting from. It will only work if I set up a new host and add an .htaccess file. What should I do? Just redirect the rest and hope for the best?
Intermediate & Advanced SEO | waqid
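Once the redirects are in place, it is worth verifying the full chain hop by hop, ideally confirming each old URL reaches the final HTTPS destination in a single 301. A small sketch, assuming the third-party requests library; the domain is the placeholder from the question:

```python
# Small sketch: print every hop in a redirect chain so you can confirm it is
# a single 301 to the final HTTPS destination, not a long chain of hops.
# Assumes the third-party "requests" library (pip install requests).
import requests

def show_redirect_chain(url: str) -> None:
    resp = requests.get(url, allow_redirects=True, timeout=10)
    for hop in resp.history:  # each intermediate redirect response
        print(f"{hop.status_code}  {hop.url}")
    print(f"{resp.status_code}  {resp.url}  (final)")

# Placeholder domain from the question:
show_redirect_chain("http://olddomain.com/")
```
-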
What should I do after a failed request for validation (error with noindex, nofollow) in new Google Search Console?
Hi guys, we have the following situation: after an error message in the new Google Search Console for a large number of pages with noindex, nofollow tags, a validation was requested before the problem was fixed (an incredibly stupid decision, taken before asking the SEO team for advice). Google started the validation, crawled 9 URLs, and changed the status to "Failed". All other URLs are still in "Pending" status. The problem has been fixed for more than 10 days, but apparently Google doesn't crawl the pages, and none of the URLs are back in the index. We tried pinging several pages and HTML sitemaps, but there is no result. Do you think we should request re-validation, or wait longer? Is there something more we could do to speed up the process?
Intermediate & Advanced SEO | ParisChildress
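Before requesting re-validation, it may help to confirm programmatically that the fix is actually live, i.e. that the affected URLs no longer send noindex in either the X-Robots-Tag header or a meta robots tag. A minimal sketch, standard library only; the URL list is a hypothetical placeholder, and the regex is deliberately naive (attribute order inside the meta tag can vary):

```python
# Minimal sketch: check pages for a lingering noindex directive in either
# the X-Robots-Tag response header or a <meta name="robots"> tag.
import re
import urllib.request

URLS = ["https://example.com/some-fixed-page"]  # placeholder URLs

for url in URLS:
    with urllib.request.urlopen(url, timeout=10) as resp:
        header = resp.headers.get("X-Robots-Tag", "")
        html = resp.read().decode("utf-8", errors="replace")
    meta = re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']*)["\']',
        html,
        re.IGNORECASE,
    )
    print(url)
    print("  X-Robots-Tag:", header or "(none)")
    print("  meta robots: ", meta.group(1) if meta else "(none)")
```
-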
Need references to a company that can transition our 1000-page website from HTTP to HTTPS without breaking our SEO backlinks and site structure
Hi y'all, I'm looking for a company or an independent contractor who can transition our website from HTTP to HTTPS. I want to make sure they know what they're doing with a WordPress website. More importantly, I want to make sure they don't break any SEO juice from external sources, while nothing gets broken internally. Does anyone have any good recommendations? You can reply back or DM me. Best, Shawn
Intermediate & Advanced SEO | Shawn124
-
Robots.txt gone wild
Hi guys, a site we manage, http://hhhhappy.com, received an alert through Webmaster Tools yesterday saying that it can't be crawled. No changes were made to the site. I don't know a huge amount about robots.txt configuration, except that Yoast by default sets it not to crawl the wp-admin folder and nothing else. I checked this against all our other sites and the settings are the same. And yet, 12 hours after the issue, Happy is still not being crawled and meta data is not showing in search results. Any ideas what may have triggered this?
Intermediate & Advanced SEO | wearehappymedia
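For reference, a stock WordPress install with that default setting typically serves a robots.txt roughly like the sketch below; the exact file can differ per site, so compare it against what http://hhhhappy.com/robots.txt actually returns:

```
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
```

Anything beyond those lines, or a robots.txt that fails to load at all, would be the first thing to investigate.
-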
Pages Disappearing from Search
Hi, we have had a strongly ranking site since 2004. Over the past couple of days, our Google traffic has dropped by around 20%, and some of our strong pages have completely disappeared from the rankings. They are still indexed, but having ranked number 1, they are now nowhere to be found. A number of pages remain intact, but it seems more are disappearing over time. Where should we start in trying to find out what is happening? Thanks
Intermediate & Advanced SEO | simonukss
-
Meta robots nofollow on page for paid links
Hi, I have a page containing paid links. I would like to add the nofollow attribute to these links, but for technical reasons I can only place a meta robots nofollow at the page level. Is that enough to tell Google that the links on this page are paid, and to prevent Google from penalizing the sites that the page links to? Thanks!
Intermediate & Advanced SEO | Kung_fu_Panda
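For reference, a page-level nofollow is a single tag in the page's <head>, roughly like this; note that it applies to every link on the page, not just the paid ones:

```html
<!-- Page-level directive: asks compliant crawlers not to follow ANY link on
     this page. rel="nofollow" on the individual <a> tags would be more
     targeted, but this is the page-level equivalent described above. -->
<meta name="robots" content="nofollow">
```
-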
Page 1 Reached, Further Page Improvements and What Next?
Moz, I have a particularly tricky competitive keyword for which I have finally climbed our website to the 10th position on page 1. I am particularly pleased about this, as all of the website and its content is in German, which I have little understanding of, and I have little support on this. I am pleased with the content and layout of the page, and I am monitoring all Google Analytics values very closely, as well as the SERP positions. As far as further progression with this page and hopefully climbing further up page 1, where do you think I should focus my efforts? Page speed optimization? Building links to this page? Blogging on this topic (with links)? Mobile responsive design (more difficult)? Further improvements to the pages and content linked from this page? Further improvements to the website in general? Further effort on tracking visitors and user experience monitoring (like setting up Crazy Egg or something)? Any other ideas would be greatly appreciated. Thanks all, James
Intermediate & Advanced SEO | Antony_Towle
-
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos, they provide a simple statement about the photo but no real information about the image, definitely not enough for the page to be deemed worthy of being indexed. The industry, however, is one that really leans on images, and having the images in Google Image search is important to us. The URL structure is like this: domain.com/community/photos/~username~/picture111111.aspx

I wish to block the whole folder from Googlebot to prevent these low-quality pages from being added to Google's main SERP results. This would be something like this:

User-agent: googlebot
Disallow: /community/photos/

Can I disallow Googlebot specifically, rather than just using User-agent: *, which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and still getting the images picked up... is this possible? Thanks! Leona
Intermediate & Advanced SEO | HD_Leona
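For what it's worth, a sketch of the robots.txt the question describes: per Google's robots.txt documentation, a crawler obeys the most specific user-agent group that matches it, so giving Googlebot-Image its own group with an empty Disallow (which allows everything) should let it fetch the photos even though the generic Googlebot group blocks the folder. Treat this as an assumption to verify in the robots.txt testing tool:

```
# Block the photo pages for Google web search...
User-agent: Googlebot
Disallow: /community/photos/

# ...but give Googlebot-Image its own group; an empty Disallow allows all.
User-agent: Googlebot-Image
Disallow:
```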