Google Search console says 'sitemap is blocked by robots?
-
Google Search console is telling me "Sitemap contains URLs which are blocked by robots.txt."
I don't understand why my sitemap is being blocked? My robots.txt look like this:
User-Agent: *
Disallow:It's a WordPress site, with Yoast SEO installed. Is anyone else having this issue with Google Search console? Does anyone know how I can fix this issue?
-
Nice happy to hear that do you work with Greg Reindel? He is a good friend I looked at your IP that is why I ask?
Tom
-
I agree with David
Hey is your dev Greg Reindel? If so you can call me for help PM me here for my info.
Thomas Zickell
-
Hey guys, I ended up disabling the sitemap option from YoastSEO, then installed the 'Google (XML) sitemap' plug-in. I re-submitted the sitemap to Google last night, and it came back with no issues. I'm glad to finally have this sorted out.
Thanks for all the help!
-
Hi Christian,
The current robots.txt shouldn't be blocking those URLs.
Did you or someone else recently change the robots.txt file? If so, give Google a few days to re-crawl your site.
Also, can you check what happens when you do a fetch and render on one of the blocked posts in Search Console? Do you have issues there?
Cheers,
David
-
I think you need to make an https robots.txt file if you are running https if running https
https://moz.com/blog/xml-sitemaps
`User-agent: * Disallow: /wp-admin/ Allow: /wp-admin/admin-ajax.php` Sitemap: https://domain.com/index-sitemap.xml
(that is a https site map)
can you send the sitemap URL or run it though deepcrawl
Hope this helps?
Did you make a new robots.txt file?
-
Thanks for the response. Do you think this is a robots.txt issue? Or could this be caused by the YoastSEO plugin?
Do you know if this plug-in works with YoastSEO together? Or will it cause issues?
-
Thank you for the response.
I just scanned the site using 'Screaming frog'. Under Internal>Directives there were zero 'no index' links. I also check for '404 errors', server 505 errors, or anything 'blocked by robots.txt'.
Google search console is still showing me that there are URL's being blocked by my sitemap. (I added a screenshot of this). When I click through, it tells me that the 'post sitemap' has over +300 warnings.
I have just deleted the YoastSEO plugin, and I am now re-installing it. hopefully, this fixes the issue.
-
No, you do not need to change or plug-in what is happening is Webmaster tools is telling you that you have no index or no follow were robots xTag somewhere on your URLs inside your sitemap.
Run your site through Moz, screaming frog Seo spider or deepcrawl and look for no indexed URLs.
webmaster tools/search console is telling you that you have no index URLs inside of your XML sitemap not that you robots.txt is blocking it. This would be set in the Yoast plugin. one way to correct it is to look for noindex URLs & filter them inside Yoast so they are not being presented to the crawlers.
If you would like you can turn off the sitemap on Yoast and turn it back on if that does not work I recommend completely removing the plug-in and reinstalling it
- https://kb.yoast.com/kb/how-can-i-uninstall-my-plugin/
- https://kinsta.com/blog/uninstall-wordpress-plugin/
Can you send a screenshot of what you're seeing?
When you see it in Google Webmaster tools are you talking about the XML sitemap itself mean no indexed because all XML sitemaps are no indexed.
Please add this to your robots.txt
`User-agent:* Disallow:/wp-admin/ Allow:/wp-admin/admin-ajax.php` Sitemap: http://www.website.com/sitemap_index.xml
I hope this is of help,
Tom
-
Hi,
Use this plugin
https://wordpress.org/plugins/wp-robots-txt/
it will remove previous robots.txt and set simple wordpress robots.txt and wait for a day
problem can be solved.
Also watch this video on the same @ https://www.youtube.com/watch?v=DZiyN07bbBM
Thanks
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Discrepancy between Search Console & LightHouse - CLS shift
Curious if anyone else is having this problem. I have, for example, a page that is listed in Search Console as having a CLS of .44 - it is listed as a "CLS issue." The same page rendered in LightHouse shows 0 for field data CLS and 0.02 for lab data (both in the "green"). It has been over a month since I made updates to the page to improve CLS. I tried to submit a validation in Search Console, but "validation failed." I'm not sure what else to fix on the page when LightHouse data shows it as in the green! I have the same issue with other pages as well.
Technical SEO | | LivDetrick0 -
Google Search Console - Sitemap
Hi all, Quick question. I'm trying to update my sitemap via Google Search Console using a sitemap.xml file that I've created with ScreamingFrog. However, when trying to submit it, it seems that Google only allows sitemaps that are located at a path within your domain (i.e. www.example.com/sitemap.xml) as opposed to being able to directly upload a sitemap.xml file.Is there any way that I can easily upload my sitemap.xml file? Or is there any easy way that I can upload the file to a path on my domain so I can upload via the URL?Any insight would be much appreciated!Best,Sung
Technical SEO | | hdeg0 -
Google's ability to crawl AJAX rendered content
I would like to make a change to the way our main navigation is currently rendered on our e-commerce site. Currently, all of the content that appears when you click a navigation category is rendering on page load. This is currently a large portion of every page visit’s bandwidth and even the images are downloaded even if a user doesn’t choose to use the navigation. I’d like to change it so the content appears and is downloaded only IF the user clicks on it, I'm planning on using AJAX. As that is the case it wouldn’t not be automatically on the site(which may or may not mean Google would crawl it). As we already provide a sitemap.xml for Google I want to make sure this change would not adversely affect our SEO. As of October this year the Webmaster AJAX crawling doc. suggestions has been depreciated. While the new version does say that its crawlers are smart enough to render AJAX content, something I've tested, I'm not sure if that only applies to content injected on page load as opposed to in click like I'm planning to do.
Technical SEO | | znotes0 -
Robots.txt blocking site or not?
Here is the robots.txt from a client site. Am I reading this right --
Technical SEO | | 540SEO
that the robots.txt is saying to ignore the entire site, but the
#'s are saying to ignore the robots.txt command? See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file To ban all spiders from the entire site uncomment the next two lines: User-Agent: * Disallow: /0 -
Just relaunched a website - why did it fall in Google's SERPs?
I work for a marketing agency that just redesigned, rewrote and relaunched a client's website. They used to rank #4 on Google for the company's name (which is a fairly common one, for what it's worth). Now they're at #10 and want to know why. I'd like to explain to them what happened but don't know myself. Can someone explain it to me? And can I tell them if/when their ranking might go back up? In case this matters, I can tell you that it looks like Google hasn't yet crawled the new site. Anyway, thanks in advance for any help you can provide.
Technical SEO | | matt-145670 -
What Google uses in search result descriptions
Recently, Google has started including certain information from our web pages in their search results description that is a bit puzzling. For example if you google 'Wedding Band Raleigh' the description they are using for our site's (GigMasters) page begins with the text 'Results 1 - 10 of 1005' Not sure why they are pulling that information. That is in on the page but its not high up on the page or marked with any special h1, h2, or h3 tag. We do have that information inside of a div which we have named 'Results'. Maybe that's why? Did we inadvertently use some sort of Google rich snippet or schema.org naming convention?! Any insight would be hugely appreciated.
Technical SEO | | gigmasters0 -
Google shows the wrong domain for client's homepage
Whenever the homepage of my client's homepage appears in Google results, the search engine is not showing our URL as our domain, but instead a partner domain that is linking to us. (The correct title and meta description of our homepage is showing.) I believe this is caused by the partner website (with a much higher pank rank) linking to our homepage from their footer to a URL with it's own domain that 302 redirects to our homepage. Example: Link: http://www.partnerwebsite.com/?ad2203 302 redirects to: http://www.clientwebsite.com/?moreadtracking The simple fix would be for the client to ask for removal of the 302 hijacking link - but they are uncomfortable with this request since they had requested it prior, and their relationship is not the best. Is there any other way to fix this?
Technical SEO | | Conor_OShea_ETUS0 -
Google has not indexed my site in over 4 weeks, what's the problem?
We recently put in permanent redirects to our new url, but Google seems to not want to index the new url. There was no problems with the old url and the new url is brand new so should have no 'black marks' against it. We have done everything we can think off in terms of submitting site maps, telling google our url has changed in webmaster tools, mentioning the new url on social sites etc...but still nothing. It has been over 4 weeks now since we set up the redirects to the url, any ideas why Google seems to be choosing not to index it? Thanks
Technical SEO | | cewe0