Scary bug in search console: All our pages reported as being blocked by robots.txt after https migration
-
We just migrated to https and created 2 days ago a new property in search console for the https domain.
Webmaster Tools account for the https domain now shows for every page in our sitemap the warning: "Sitemap contains urls which are blocked by robots.txt."Also in the dashboard of the search console it shows a red triangle with warning that our root domain would be blocked by robots.txt. 1) When I test the URLs in search console robots.txt test tool all looks fine.2) When I fetch as google and render the page it renders and indexes without problem (would not if it was really blocked in robots.txt)3) We temporarily completely emptied the robots.txt, submitted it in search console and uploaded sitemap again and same warnings even though no robots.txt was online4) We run screaming frog crawl on whole website and it indicates that there is no page blocked by robots.txt5) We carefully revised the whole robots.txt and it does not contain any row that blocks relevant content on our site or our root domain. (same robots.txt was online for last decade in http version without problem)6) In big webmaster tools I could upload the sitemap and so far no error reported.7) we resubmitted sitemaps and same issue8) I see our root domain already with https in google SERPThe site is https://www.languagecourse.netSince the site has significant traffic, if google would really interpret for any reason that our site is blocked by robots we will be in serious trouble.
This is really scary, so even if it is just a bug in search console and does not affect crawling of the site, it would be great if someone from google could have a look into the reason for this since for a site owner this really can increase cortisol to unhealthy levels.Anybody ever experienced the same problem?Anybody has an idea where we could report/post this issue? -
Hi icourse, thanks for your question! You've received some thoughtful responses. Did any of them help you sort your issue out? If so, please mark one or more as a "Good Answer." Thanks!
Christy
-
I'd still speak with the hosting provider. It may be a firewall setting, not the CDN.
-
Hi Donna, we are using cloudflare which may have blocked your scan/bot.
We disabled cloudflare temporarily and submitted new robots.txt and new sitemap and still get the same warnings. -
I recently updated a Wordpress website from noindex to wanting it indexed and I still had a warning in Search Console for a day or two and the homepage was initially indexed with the meta description saying "A description for this website ...", even though the actual Fetch and Render and also Robots.txt test was just fine. If you are absolutely sure there isn't anything wrong, I would maybe give it a couple of days. In our case, I resubmitted the homepage to Google to speed up the process and that fixed it.
-
Have you checked with your website hosting provider? They may be blocking bots at a server level. I know when I tried to scan your site I got a connection timeout error.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Recovery from a HTTP to HTTPs migration using 302s ?
If a website did an HTTP to HTTPS migration using 302 re-directs, that were corrected to 301s about 4 months later, what is the expected impact? Will the website see a full recovery or has the damage been done? Thanks to anyone who can shed some light on this...
Intermediate & Advanced SEO | | yaelslater0 -
Why does Google rank a product page rather than a category page?
Hi, everybody In the Moz ranking tool for one of our client's (the client sells sport equipment) account, there is a trend where more and more of their landing pages are product pages instead of category pages. The optimal landing page for the term "sleeping bag" is of course the sleeping bag category page, but Google is sending them to a product page for a specific sleeping bag.. What could be the critical factors that makes the product page more relevant than the category page as the landing page?
Intermediate & Advanced SEO | | Inevo0 -
Home Page or Internal Page
I have a website that deals with personalized jewelry, and our main keyword is "Name Necklace".
Intermediate & Advanced SEO | | Tiedemann_Anselm
3 mounth ago i added new page: http://www.onecklace.com/name-necklaces/ And from then google index only this page for my main keyword, and not our home page.
Beacuase the page is new, and we didn't have a lot of link to it, our rank is not so well. I'm considering to remove this page (301 to home page), beacause i think that if google index our home page for this keyword it will be better. I'm not sure if this is a good idea, but i know that our home page have a lot of good links and maybe our rank will be higher. Another thing, because google index this internal page for this keyword, it looks like our home page have no main keyword at all. BTW, before i add this page, google index our main page with this keyword. Please advise... U5S8gyS.png j50XHl4.png0 -
To index search results or to not index search results?
What are your feelings about indexing search results? I know big brands can get away with it (yelp, ebay, etc). Apart from UGC, it seems like one of the best ways to capture long tail traffic at scale. If the search results offer valuable / engaging content, would you give it a go?
Intermediate & Advanced SEO | | nicole.healthline0 -
How Do You Remove Video Thumbnails From Google Search Result Pages?
This is going to be a long question, but, in a nutshell, I am asking if anyone knows how to remove video thumbnails from Google's search result pages? We have had video thumbnails show up next to many of our organic listings in Google's search result pages for several months. To be clear, these are organic listings for our site, not results from performing a video search. When you click on the thumbnail or our listing title, you go to the same page on our site - a list of products or the product page. Although it was initially believed that these thumbnails drew the eye to our listings and that we would receive more traffic, we are actually seeing severe year over year declines in traffic to our category pages with thumbnails vs. category pages without thumbnails (where average rank remained relatively constant). We believe this decline is due to several things: An old date stamp that makes our listing look outdated (despite the fact that we can prove Google has spidered and updated their cache of these pages as recent as 2 days ago). We have no idea where Google is getting this datestamp from. An unrelated thumbnail to the page title, etc. - sometimes a picture of a man's face when the category is for women's handbags A difference in intent - user intends to shop or browse, not watch a video. They skip our listing because it looks like a video even though both the thumbnail and our listing click through to a category page of products. So we want to remove these video thumbnails from Google's search results without removing our pages from the index. Does anyone know how to do this? We believed that this connection between category page and video was happening in our video sitemap. We have removed all reference to video and category pages in the sitemap. After making this change and resubmitting the sitemap in Webmaster Tools, we have not seen any changes in the search results (it's been over 2 weeks). I've been reading and it appears many believe that Google can identify video embedded in pages. That makes sense. We can certainly remove videos from our category pages to truly remove the connection between category page URL and video thumbnail. However, I don't believe this is enough because in some cases you can find video thumbnails next to listings where the page has not had a video thumbnail in months (example: search for "leather handbags" and find www.ebags.com/category/handbags/m/leather - that video does not exist on that page and has not for months. Similarly, do a search for "handbags" and find www.ebags.com/department/handbags. That video has not been on that page since 2010. Any ideas?
Intermediate & Advanced SEO | | SharieBags0 -
301 Re-Directs Puzzling Question on Page Returned in Search Results
On our website, www.BusinessBroker.net, we have 3 different versions of essentially the same page for each of our State Business for Sale Pages. Back in August, we did a test and did 301 redirects using 5 States. For a long while after doing the redirects, the pages fell out of Google search results - we used to get page 1 rankings. Just recently they started popping back up on Page 1. However, I noticed that the new page meta data is not what is being picked up -- here is the example. Keyword Searched for in Google -- "Maine Business for Sale" Our listing shows up on Page 1 -- # 8 Result URL returned is correct preferred version: - http://www.businessbroker.net/state/maine-Businesses_For_Sale.aspx However, the Page Title on this returned page is still the OLD page title - OLD TITLE -- maine Business for Sale Ads - maine Businesses for Sale & Business Brokers - Sell a Business on Business Broker Not the title that is designated for this page - New Title - Maine Businesses for Sale - Buy or Sell a Business in ME | BusinessBroker.net Ditto for Meta Description. Why is this happening? Also have a problem with lower case showing up rather than upper case -- what's causing this? http://www.businessbroker.net/state/maine-Businesses_For_Sale.aspx versus -- http://www.businessbroker.net/State/Maine-Businesses_For_Sale.aspx Any help would be appreciated. Thanks, MM
Intermediate & Advanced SEO | | MWM37720 -
What content should I block in wodpress with robots.txt?
I need to know if anyone has tips on creating a good robots.txt. I have read a lot of info, but I am just not clear on what I should allow and not allow on wordpress. For example there are pages and posts, then attachments, wp-admin, wp-content and so on. Does anyone have a good robots.txt guideline?
Intermediate & Advanced SEO | | ENSO0 -
Robots.txt disallow subdomain
Hi all, I have a development subdomain, which gets copied to the live domain. Because I don't want this dev domain to get crawled, I'd like to implement a robots.txt for this domain only. The problem is that I don't want this robots.txt to disallow the live domain. Is there a way to create a robots.txt for this development subdomain only? Thanks in advance!
Intermediate & Advanced SEO | | Partouter0