Sitemap Contains Blocked Resources
-
Hey Mozzers,
I have several pages on my website that are for user search purposes only. They sort some products by range and answer some direct search queries users type into the site. They are basically just product collections that are grouped in different ways elsewhere.
As such, I didn't want search engines getting their hands on them, so I blocked them in robots.txt so I could add them worry-free. However, they automatically get pulled into the sitemap by Magento.
This has made Webmaster Tools give me a warning that 21 URLs in the sitemaps are blocked by robots.txt.
Is this terrible SEO wise?
Should I have opted to NOINDEX these URLs instead? I was concerned about thin content, so I really didn't want Google crawling them.
-
Thanks for the latest responses guys
I have researched it into the grave, and the way Magento generates the sitemap makes it impossible for me to exclude these URLs.
I will just unblock them in robots.txt and make them all noindex. This seems to solve all the problems. I will then block them again when I'm 100% sure they are no longer indexed.
Thanks Again chaps.
Big help as always.
-
OK, so first: because some are indexed, if you block access, Google can't recrawl the pages to see a noindex, so they will never be removed.
What you will need to do is add a noindex tag to the pages but don't block access to them so that Google can honour the noindex. Remove the pages via Search Console and once you have confirmed these are all removed from the index, you will be good to then block access via robots.txt.
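The order of operations above can be sketched in code. This is only a rough illustration, not anything Magento or Search Console provides: `has_noindex` and `next_step` are hypothetical helper names, and the page HTML and the `X-Robots-Tag` header value are assumed to have been fetched already by whatever means you prefer.

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str, x_robots_tag: str = "") -> bool:
    """True if the page asks not to be indexed, via meta tag or X-Robots-Tag header."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header = {d.strip().lower() for d in x_robots_tag.split(",")}
    return "noindex" in parser.directives or "noindex" in header


def next_step(noindex: bool, blocked_by_robots: bool, still_indexed: bool) -> str:
    """Hypothetical helper: which step of the sequence a URL is at."""
    if not still_indexed:
        return "safe to block in robots.txt"
    if not noindex:
        return "add a noindex tag"
    if blocked_by_robots:
        return "unblock in robots.txt so the noindex can be crawled"
    return "wait for removal, or request it in Search Console"
```

For example, `next_step(has_noindex(page_html), blocked_by_robots=True, still_indexed=True)` on a page that already carries the tag would tell you to unblock it first. Whether a URL is still indexed has to be checked by hand (for instance with a `site:` search); the sketch does not try to detect that.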
As CleverPhD said, ideally you don't want pages in the index that can't be crawled, but it isn't likely to cause a penalty of any sort (I have a client with about 70-80 blocked, long story, and no issues in 12 months). If you are stuck because of Magento, perhaps research how others have got around this?
-Andy
-
I would recommend that you try and get those pages out of your sitemap. If you look through the Google sitemap best practices, it states that the sitemap should be for pages that Googlebot can access.
http://googlewebmastercentral.blogspot.com/2014/10/best-practices-for-xml-sitemaps-rssatom.html
URLs
URLs in XML sitemaps and RSS/Atom feeds should adhere to the following guidelines:
- Only include URLs that can be fetched by Googlebot. A common mistake is including URLs disallowed by robots.txt, which cannot be fetched by Googlebot, or including URLs of pages that don't exist.
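If you want to find which sitemap entries trip this guideline before Webmaster Tools reports them, something along these lines might help. It is a minimal sketch using only the Python standard library; `sitemap_urls` and `blocked_urls` are hypothetical helper names, and the sitemap XML and robots.txt text are assumed to have been downloaded already. Note that `urllib.robotparser` does plain prefix matching and does not implement Google's wildcard extensions (`*`, `$`), so treat the result as an approximation of what Googlebot would do.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard XML sitemaps.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(sitemap_xml: str) -> list:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def blocked_urls(sitemap_xml: str, robots_txt: str, user_agent: str = "Googlebot") -> list:
    """Return the sitemap URLs that robots.txt disallows for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url in sitemap_urls(sitemap_xml)
        if not parser.can_fetch(user_agent, url)
    ]
```

Any URL this returns either needs to come out of the sitemap or be unblocked in robots.txt.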
-
Hi Andy,
I just checked and yes, they were previously indexed and some of them still are.
-
Hi,
Is this terrible SEO wise?
Not really - it just means that Google can see that there is a page it can't access, so it is informing you of this. No penalty is going to come from this. If these were old pages that now return 404s, it would be a different story.
I just want to be sure of something - were the pages previously open to Google? Are they currently indexed?
-Andy