Scary bug in search console: All our pages reported as being blocked by robots.txt after https migration
-
We just migrated to HTTPS and, two days ago, created a new property in Search Console for the HTTPS domain.
The Search Console (Webmaster Tools) account for the HTTPS domain now shows this warning for every page in our sitemap: "Sitemap contains urls which are blocked by robots.txt." The Search Console dashboard also shows a red triangle with a warning that our root domain is blocked by robots.txt.
1) When I test the URLs in the Search Console robots.txt testing tool, everything looks fine.
2) When I fetch as Google and render the page, it renders and indexes without a problem (it would not if it were really blocked by robots.txt).
3) We temporarily emptied the robots.txt completely, submitted it in Search Console, and uploaded the sitemap again. We got the same warnings even though no robots.txt was online.
4) We ran a Screaming Frog crawl of the whole website, and it indicates that no page is blocked by robots.txt.
5) We carefully reviewed the whole robots.txt, and it does not contain any line that blocks relevant content on our site or our root domain. (The same robots.txt was online for the last decade on the HTTP version without a problem.)
6) In Bing Webmaster Tools I could upload the sitemap, and so far no error has been reported.
7) We resubmitted the sitemaps and got the same issue.
8) I already see our root domain with https in the Google SERP.
The site is https://www.languagecourse.net
Since the site has significant traffic, we will be in serious trouble if Google really does interpret, for any reason, that our site is blocked by robots.txt.
This is really scary. Even if it is just a bug in Search Console and does not affect crawling of the site, it would be great if someone from Google could look into the reason for this, since for a site owner this really can increase cortisol to unhealthy levels.
Has anybody ever experienced the same problem?
Does anybody have an idea where we could report or post this issue?
-
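For anyone who wants to repeat the robots.txt check from points 1) and 5) outside of Search Console, here is a minimal sketch using Python's standard `urllib.robotparser`. The sample rules, the example.com URLs, and the helper name `blocked_urls` are made up for illustration and are not taken from the site in question. Note that this parser uses simple prefix matching and is not guaranteed to behave exactly like Googlebot (for instance, it does not implement Google's `*` and `$` wildcards), so treat the result as a rough cross-check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent.

    Hypothetical helper: parses robots.txt content held in memory, so no
    network access is needed.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Illustrative rules and URLs only (assumptions, not the real site's files).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

SAMPLE_URLS = [
    "https://www.example.com/",
    "https://www.example.com/admin/login",
]

print(blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS))
# An empty robots.txt (as in point 3) blocks nothing in this parser.
print(blocked_urls("", SAMPLE_URLS))
```

To check a real site, paste the live robots.txt content and the sitemap URLs in place of the samples. If this reports nothing blocked while Search Console still warns, the cause is probably somewhere other than the rules themselves.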
Hi icourse, thanks for your question! You've received some thoughtful responses. Did any of them help you sort your issue out? If so, please mark one or more as a "Good Answer." Thanks!
Christy
-
I'd still speak with the hosting provider. It may be a firewall setting, not the CDN.
-
Hi Donna, we are using Cloudflare, which may have blocked your scan/bot.
We disabled Cloudflare temporarily and submitted a new robots.txt and a new sitemap, and we still get the same warnings.
-
I recently switched a WordPress website from noindex to indexable, and I still had a warning in Search Console for a day or two. The homepage was also initially indexed with the meta description saying "A description for this website ...", even though the actual Fetch and Render and the robots.txt test were just fine. If you are absolutely sure there isn't anything wrong, I would give it a couple of days. In our case, I resubmitted the homepage to Google to speed up the process, and that fixed it.
-
Have you checked with your website hosting provider? They may be blocking bots at a server level. I know when I tried to scan your site I got a connection timeout error.
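If a server-level block is suspected, one thing that may be worth checking is the HTTP status returned for /robots.txt itself. As far as I understand Google's published documentation on robots.txt handling, a 5xx or 429 response is temporarily treated as if the whole site were disallowed, while other 4xx responses are treated as if no robots.txt existed. The sketch below encodes that reading. The function names and the User-Agent string are hypothetical, and the mapping is my summary of the documentation, not an official tool.

```python
import urllib.error
import urllib.request


def interpret_robots_status(status):
    """Summarize how a robots.txt HTTP status is likely to be treated.

    Assumption: this mirrors my reading of Google's public documentation.
    """
    if 200 <= status < 300:
        return "rules are parsed normally"
    if status == 429 or 500 <= status < 600:
        return "temporarily treated as fully disallowed"
    if 400 <= status < 500:
        return "treated as if no robots.txt exists"
    return "redirect or unexpected status, inspect manually"


def fetch_robots_status(site, timeout=10.0):
    """Fetch <site>/robots.txt and return the HTTP status, or None on failure.

    None covers timeouts and connection errors, which is what a firewall
    that silently drops bot traffic would tend to produce.
    """
    request = urllib.request.Request(
        site.rstrip("/") + "/robots.txt",
        headers={"User-Agent": "robots-check-sketch/0.1"},  # hypothetical UA
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return None
```

Usage would look like `status = fetch_robots_status("https://www.example.com")` followed by `interpret_robots_status(status)` when the status is not None. A request from your own machine can succeed where a crawler's is blocked, so a clean result here does not rule out a firewall rule that targets bots.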
Related Questions
-
Site moved. Unable to index page : Noindex detected in robots meta tag?!
Hope someone can shed some light on this. We moved our smaller site into the main site (different domains).
The smaller site that was moved: https://www.bluegreenrentals.com
Directory where the site was moved: https://www.bluegreenvacations.com/rentals
Each page from the old site was 301 redirected to the appropriate page under .com/rentals. But we are seeing a significant drop in rankings and traffic, as I am unable to request a change of address in Google Search Console (a separate issue that I can elaborate on). Lots of the new destination pages (the 301 redirect targets) are not indexed. When inspected, I got this message: Indexing allowed? No: 'noindex' detected in 'robots' meta tag. All pages are set as index/follow, and there are no restrictions in robots.txt. Here is an example URL: https://www.bluegreenvacations.com/rentals/resorts/colorado/innsbruck-aspen/ Can someone take a look and share an opinion on this issue? Thank you!
Intermediate & Advanced SEO | | bgvsiteadmin0 -
Why are these pages showing up as 404 in Search Console?
This page is showing up as a 404 in Google Search console- https://www.wolfautomation.com/blog/autonics/ It shows it has been linked from these pages- https://www.wolfautomation.com/blog/raffel/ https://www.wolfautomation.com/blog/new-temperature-controllers-from-autonics/ https://www.wolfautomation.com/blog/ge-industrial/ https://www.wolfautomation.com/blog/temp-controller/ https://www.wolfautomation.com/blog/tx4s/ I never created this page, I don't want this page but it keeps showing up. The problem is the link isn't found on those pages anywhere so I can't delete it. What am I missing? How can I get rid of it?
Intermediate & Advanced SEO | | Tylerj0 -
Robots.txt Allowed
Hello all, We want to block something that has the following at the end: http://www.domain.com/category/product/some+demo+-text-+example--writing+here So I was wondering if doing: /*example--writing+here would work?
Intermediate & Advanced SEO | | ThomasHarvey0 -
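A quick way to sanity-check a wildcard rule like the one asked about above is to emulate the matching locally. This is a minimal sketch, assuming the commonly documented semantics in which `*` matches any sequence of characters, a trailing `$` anchors the end of the path, and everything else is a literal prefix match. The function name `robots_pattern_matches` is hypothetical, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re


def robots_pattern_matches(pattern, path):
    """Check whether a robots.txt path pattern matches a URL path.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and matching starts at the beginning.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None


rule = "/*example--writing+here"
path = "/category/product/some+demo+-text-+example--writing+here"
print(robots_pattern_matches(rule, path))
```

Under these assumptions, `Disallow: /*example--writing+here` would match the example path, because the rule only requires that the text appears somewhere after the leading slash.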
Our client's web property recently switched over to secure pages (https); however, their non-secure pages (http) are still being indexed in Google. Should we request in GWMT to have the non-secure pages deindexed?
Our client recently switched over to https via a new SSL certificate. They have also implemented rel canonicals for most of their internal webpages (pointing to the https versions). However, many of their non-secure webpages are still being indexed by Google. We have access to their GWMT for both the secure and non-secure pages.
Should we just let Google figure out what to do with the non-secure pages? We would like to set up 301 redirects from the old non-secure pages to the new secure pages, but we're not sure if this is going to happen. We thought about requesting in GWMT that Google remove the non-secure pages. However, we felt this was pretty drastic. Any recommendations would be much appreciated.
Intermediate & Advanced SEO | | RosemaryB0 -
How Do You Remove Video Thumbnails From Google Search Result Pages?
This is going to be a long question, but, in a nutshell, I am asking if anyone knows how to remove video thumbnails from Google's search result pages.
We have had video thumbnails show up next to many of our organic listings in Google's search result pages for several months. To be clear, these are organic listings for our site, not results from performing a video search. When you click on the thumbnail or our listing title, you go to the same page on our site: a list of products or the product page. Although it was initially believed that these thumbnails drew the eye to our listings and that we would receive more traffic, we are actually seeing severe year-over-year declines in traffic to our category pages with thumbnails vs. category pages without thumbnails (where average rank remained relatively constant). We believe this decline is due to several things:
- An old date stamp that makes our listing look outdated (despite the fact that we can prove Google has spidered and updated its cache of these pages as recently as 2 days ago). We have no idea where Google is getting this date stamp from.
- A thumbnail unrelated to the page title, etc. Sometimes it is a picture of a man's face when the category is for women's handbags.
- A difference in intent. The user intends to shop or browse, not watch a video. They skip our listing because it looks like a video, even though both the thumbnail and our listing click through to a category page of products.
So we want to remove these video thumbnails from Google's search results without removing our pages from the index. Does anyone know how to do this? We believed that this connection between category page and video was happening in our video sitemap. We have removed all references to video and category pages in the sitemap. After making this change and resubmitting the sitemap in Webmaster Tools, we have not seen any changes in the search results (it's been over 2 weeks).
I've been reading, and it appears many believe that Google can identify video embedded in pages. That makes sense. We can certainly remove videos from our category pages to truly remove the connection between category page URL and video thumbnail. However, I don't believe this is enough, because in some cases you can find video thumbnails next to listings where the page has not had a video thumbnail in months. For example, search for "leather handbags" and find www.ebags.com/category/handbags/m/leather: that video does not exist on that page and has not for months. Similarly, do a search for "handbags" and find www.ebags.com/department/handbags. That video has not been on that page since 2010. Any ideas?
Intermediate & Advanced SEO | | SharieBags0 -
301 Re-Directs Puzzling Question on Page Returned in Search Results
On our website, www.BusinessBroker.net, we have 3 different versions of essentially the same page for each of our State Business for Sale Pages. Back in August, we ran a test and set up 301 redirects for 5 States. For a long while after doing the redirects, the pages fell out of Google search results (we used to get page 1 rankings). Just recently they started popping back up on Page 1. However, I noticed that the new page meta data is not what is being picked up. Here is the example.
Keyword searched for in Google: "Maine Business for Sale"
Our listing shows up on Page 1, result #8.
The URL returned is the correct preferred version: http://www.businessbroker.net/state/maine-Businesses_For_Sale.aspx
However, the Page Title on this returned page is still the OLD page title:
OLD TITLE: maine Business for Sale Ads - maine Businesses for Sale & Business Brokers - Sell a Business on Business Broker
Not the title that is designated for this page:
New Title: Maine Businesses for Sale - Buy or Sell a Business in ME | BusinessBroker.net
Ditto for Meta Description. Why is this happening?
We also have a problem with lower case showing up rather than upper case. What's causing this?
http://www.businessbroker.net/state/maine-Businesses_For_Sale.aspx
versus
http://www.businessbroker.net/State/Maine-Businesses_For_Sale.aspx
Any help would be appreciated. Thanks, MM
Intermediate & Advanced SEO | | MWM37720 -
301 redirect or Robots.txt on an interstitial page
Hey guys, I have an affiliate tracking system that works like this: an affiliate puts up a certain code on his site, for example www.domain.com/track/aff_id. This URL leads to a page where the hit is counted and analysed, and which then 302 redirects to my sales page with the affiliate's ID in the URL: www.mysalespage.com/?=aff_id. However, we've noticed recently that one affiliate seems to be ranking for our own name, and the URL Google indexed was his tracking URL (domain.com/track/aff_id). That is strange because there is absolutely nothing on that page; it's just an interstitial page so that our stats tracking software can properly filter hits. To remove the affiliate's URL from the SERPs, I've come up with 2 solutions:
1 - Change the redirect on his track page to a 301 redirect.
2 - Change our robots.txt to block all domain.com/track/ pages from being indexed.
My question is: if I 301 redirect instead of 302, will I keep the affiliates from outranking me for my own name AND pass on link juice, or should I simply block Google from crawling the interstitial tracking pages?
Intermediate & Advanced SEO | | CrakJason0 -
Which page to target? Home or /landing-page
I have optimized my home page for the keyword "computer repairs". Would I be better off targeting my links at this page or at an additional page (which already exists) called /repairs? It's possible to rename this page to /computer-repairs and 301 redirect it. The only advantage I can see from targeting /computer-repairs is that the keywords are in the target URL.
Intermediate & Advanced SEO | | SEOKeith0