Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Google Search console says 'sitemap is blocked by robots?
-
Google Search console is telling me "Sitemap contains URLs which are blocked by robots.txt."
I don't understand why my sitemap is being blocked? My robots.txt look like this:
User-Agent: *
Disallow:It's a WordPress site, with Yoast SEO installed. Is anyone else having this issue with Google Search console? Does anyone know how I can fix this issue?
-
Nice happy to hear that do you work with Greg Reindel? He is a good friend I looked at your IP that is why I ask?
Tom
-
I agree with David
Hey is your dev Greg Reindel? If so you can call me for help PM me here for my info.
Thomas Zickell
-
Hey guys, I ended up disabling the sitemap option from YoastSEO, then installed the 'Google (XML) sitemap' plug-in. I re-submitted the sitemap to Google last night, and it came back with no issues. I'm glad to finally have this sorted out.
Thanks for all the help!
-
Hi Christian,
The current robots.txt shouldn't be blocking those URLs.
Did you or someone else recently change the robots.txt file? If so, give Google a few days to re-crawl your site.
Also, can you check what happens when you do a fetch and render on one of the blocked posts in Search Console? Do you have issues there?
Cheers,
David
-
I think you need to make an https robots.txt file if you are running https if running https
https://moz.com/blog/xml-sitemaps
`User-agent: * Disallow: /wp-admin/ Allow: /wp-admin/admin-ajax.php` Sitemap: https://domain.com/index-sitemap.xml
(that is a https site map)
can you send the sitemap URL or run it though deepcrawl
Hope this helps?
Did you make a new robots.txt file?
-
Thanks for the response. Do you think this is a robots.txt issue? Or could this be caused by the YoastSEO plugin?
Do you know if this plug-in works with YoastSEO together? Or will it cause issues?
-
Thank you for the response.
I just scanned the site using 'Screaming frog'. Under Internal>Directives there were zero 'no index' links. I also check for '404 errors', server 505 errors, or anything 'blocked by robots.txt'.
Google search console is still showing me that there are URL's being blocked by my sitemap. (I added a screenshot of this). When I click through, it tells me that the 'post sitemap' has over +300 warnings.
I have just deleted the YoastSEO plugin, and I am now re-installing it. hopefully, this fixes the issue.
-
No, you do not need to change or plug-in what is happening is Webmaster tools is telling you that you have no index or no follow were robots xTag somewhere on your URLs inside your sitemap.
Run your site through Moz, screaming frog Seo spider or deepcrawl and look for no indexed URLs.
webmaster tools/search console is telling you that you have no index URLs inside of your XML sitemap not that you robots.txt is blocking it. This would be set in the Yoast plugin. one way to correct it is to look for noindex URLs & filter them inside Yoast so they are not being presented to the crawlers.
If you would like you can turn off the sitemap on Yoast and turn it back on if that does not work I recommend completely removing the plug-in and reinstalling it
- https://kb.yoast.com/kb/how-can-i-uninstall-my-plugin/
- https://kinsta.com/blog/uninstall-wordpress-plugin/
Can you send a screenshot of what you're seeing?
When you see it in Google Webmaster tools are you talking about the XML sitemap itself mean no indexed because all XML sitemaps are no indexed.
Please add this to your robots.txt
`User-agent:* Disallow:/wp-admin/ Allow:/wp-admin/admin-ajax.php` Sitemap: http://www.website.com/sitemap_index.xml
I hope this is of help,
Tom
-
Hi,
Use this plugin
https://wordpress.org/plugins/wp-robots-txt/
it will remove previous robots.txt and set simple wordpress robots.txt and wait for a day
problem can be solved.
Also watch this video on the same @ https://www.youtube.com/watch?v=DZiyN07bbBM
Thanks
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why can't google mobile friendly test access my website?
getting the following error when trying to use google mobile friendly tool: "page cannot be reached. This could be because the page is unavailable or blocked by robots.txt" I don't have anything blocked by robots.txt or robots tag. i also manage to render my pages on google search console's fetch and render....so what can be the reason that the tool can't access my website? Also...the mobile usability report on the search console works but reports very little, and the google speed test also doesnt work... Any ideas to what is the reason and how to fix this? LEARN MOREDetailsUser agentGooglebot smartphone
Technical SEO | | Nadav_W0 -
How to remove Parameters from Google Search Console?
Hi All, Following are parameter configuration in search console - Parameters - fl
Technical SEO | | adamjack
Does this parameter change page content seen by the user? - Yes, Changes, reorders, or narrows page content.
How does this parameter affect page content? - Narrow
Which URLs with this parameter should Googlebot crawl? - Let Googlebot decide (Default) Query - Actually it is filter parameter. I have already set canonical on filter page. Now I am doing tracking of filter pages via data layer and tag manager so in google analytic I am not able to see filter url's because of this parameter. So I want to delete this parameter. Can anyone please help me? Thanks!0 -
Robots.txt & meta noindex--site still shows up on Google Search
I have set up my robots.txt like this: User-agent: *
Technical SEO | | RoxBrock
Disallow: / and I have this meta tag in my on a Wordpress site, set up with SEO Yoast name="robots" content="noindex,follow"/> I did "Fetch as Google" on my Google Search Console My website is still showing up in the search results and it says this: "A description for this result is not available because of this site's robots.txt" This site has not shown up for years and now it is ranking above my site that I want to rank for this keyword. How do I get Google to ignore this site? This seems really weird and I'm confused how a site with little content, that has not been updated for years can rank higher than a site that is constantly updated and improved.1 -
Blocking Affiliate Links via robots.txt
Hi, I work with a client who has a large affiliate network pointing to their domain which is a large part of their inbound marketing strategy. All of these links point to a subdomain of affiliates.example.com, which then redirects the links through a 301 redirect to the relevant target page for the link. These links have been showing up in Webmaster Tools as top linking domains and also in the latest downloaded links reports. To follow guidelines and ensure that these links aren't counted by Google for either positive or negative impact on the site, we have added a block on the robots.txt of the affiliates.example.com subdomain, blocking search engines from crawling the full subddomain. The robots.txt file is the following code: User-agent: * Disallow: / We have authenticated the subdomain with Google Webmaster Tools and made certain that Google can reach and read the robots.txt file. We know they are being blocked from reading the affiliates subdomain. However, we added this affiliates subdomain block a few weeks ago to the robots.txt, but links are still showing up in the latest downloads report as first being discovered after we added the block. It's been a few weeks already, and we want to make sure that the block was implemented properly and that these links aren't being used to negatively impact the site. Any suggestions or clarification would be helpful - if the subdomain is being blocked for the search engines, why are the search engines following the links and reporting them in the www.example.com subdomain GWMT account as latest links. And if the block is implemented properly, will the total number of links pointing to our site as reported in the links to your site section be reduced, or does this not have an impact on that figure?From a development standpoint, it's a much easier fix for us to adjust the robots.txt file than to change the affiliate linking connection from a 301 to a 302, which is why we decided to go with this option.Any help you can offer will be greatly appreciated.Thanks,Mark
Technical SEO | | Mark_Ginsberg0 -
How to remove my cdn sub domins on Google search result?
A few months ago I moved all my Wordpress images into a sub domain. After I purchased CDN service, I again moved that images to my root domain. I added User-agent: * Disallow: / to my CDN domain. But now, when I perform site search on the Google, I found that my CDN sub domains are indexed by the Google. I think this will make duplicate content issue. I already hit by the Panguin. How do I remove these search results on Google? Should I add my cdn domain to webmaster tools to request URL removal request? Problem is, If I use cdn.mydomain.com it shows my www.mydomain.com. My blog:- http://goo.gl/58Utt site search result:- http://goo.gl/ElNwc
Technical SEO | | Godad1 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an xml sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedent over the xmls sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this? or any experience with a website that has had a clear Disallow: / for months , that somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Why are Google search results different if you are log'd into Google or not?
I get different results when I'm log'd into my Google account associated with my website than if I'm not. The same country is occurring. So how can I rely on the google results I'm seeing? For instance my site is page 1 with the improvements I made based on SEOMOZ if I'm log'd in. Yet I'm not on the first 25 pages if I'm not logged in.
Technical SEO | | Romana0 -
NoIndex/NoFollow pages showing up when doing a Google search using "Site:" parameter
We recently launched a beta version of our new website in a subdomain of our existing site. The existing site is www.fonts.com with the beta living at new.fonts.com. We do not want Google to crawl the new site until it's out of beta so we have added the following on all pages: However, one of our team members noticed that google is displaying results from new.fonts.com when doing an "site:new.fonts.com" search (see attached screenshot). Is it possible that Google is indexing the content despite the noindex, nofollow tags? We have double checked the syntax and it seems correct except the trailing "/". I know Google still crawls noindexed pages, however, the fact that they're showing up in search results using the site search syntax is unsettling. Any thoughts would be appreciated! DyWRP.png
Technical SEO | | ChrisRoberts-MTI0