Sitemap Contains Blocked Resources
-
Hey Mozzers,
I have several pages on my website that are for user search purposes only. They sort some products by range and answer some direct search queries users type into the site. They are basically just product collections that are grouped in different ways elsewhere.
As such, I didn't want search engines getting their hands on them, so I blocked them in robots.txt so I could add them worry-free. However, they automatically get pulled into the sitemap by Magento.
This has made Webmaster Tools give me a warning that 21 URLs in the sitemap are blocked by robots.txt.
Is this terrible SEO wise?
Should I have opted to noindex these URLs instead? I was concerned about thin content, so I really didn't want Google crawling them.
-
Thanks for the latest responses, guys.
I have researched it into the grave, and the way Magento generates the sitemap makes it impossible for me to exclude these URLs.
I will just unblock them in robots.txt and make them all noindex. This seems to solve all the problems. I will then block them again once I'm 100% sure they have been removed from the index.
Thanks again, chaps.
Big help as always.
-
OK, so first: because some are indexed, if you block access, they will never be removed.
What you will need to do is add a noindex tag to the pages but not block access to them, so that Google can crawl them and honour the noindex. Remove the pages via Search Console, and once you have confirmed they are all removed from the index, you will be good to block access via robots.txt.
As CleverPhD said, ideally you don't want pages in the index that can't be crawled, but it isn't likely to cause a penalty of any sort (I have a client with about 70-80 blocked URLs, long story, and no issues in 12 months). If you are stuck because of Magento, perhaps research how others have got around this?
-Andy
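
Before re-blocking anything, it can help to confirm that each page really serves the noindex tag. Below is a minimal Python sketch of that check, assuming the tag is delivered as a robots meta tag in the HTML (it does not look at an X-Robots-Tag HTTP header). The class name `RobotsMetaParser`, the function `has_noindex` and the sample markup are all hypothetical, for illustration only.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the page's robots meta tag contains a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page markup, for illustration only.
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(SAMPLE_PAGE))
```

In practice you would fetch each affected URL and pass its HTML to `has_noindex`. The page has to stay crawlable while you wait, otherwise Google never sees the tag.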
-
I would recommend that you try to get those pages out of your sitemap. Google's sitemap best practices state that the sitemap should only include pages that Googlebot can access.
http://googlewebmastercentral.blogspot.com/2014/10/best-practices-for-xml-sitemaps-rssatom.html
URLs
URLs in XML sitemaps and RSS/Atom feeds should adhere to the following guidelines:
- Only include URLs that can be fetched by Googlebot. **A common mistake is** including URLs disallowed by robots.txt, which cannot be fetched by Googlebot, or including URLs of pages that don't exist.
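
To find which sitemap entries trip this guideline, a rough check along the following lines may help. It is only a sketch using Python's standard library: the function name `blocked_sitemap_urls`, the example.com URLs and the sample robots.txt rules are made up for illustration. Note too that `urllib.robotparser` does simple prefix matching and does not implement Google's wildcard extensions, so results can differ from what Search Console reports.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for the given user agent."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not robots.can_fetch(user_agent, url)]


# Hypothetical robots.txt and sitemap contents, for illustration only.
ROBOTS = """User-agent: *
Disallow: /search/
"""

SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/products/widget</loc></url>
  <url><loc>https://www.example.com/search/red-widgets</loc></url>
</urlset>
"""

for url in blocked_sitemap_urls(SITEMAP, ROBOTS):
    print("In sitemap but blocked:", url)
```

Run against the real robots.txt and sitemap files, this should list the same kind of URLs the Webmaster Tools warning is counting.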
-
Hi Andy,
I just checked and yes, they were previously indexed, and some of them still are.
-
Hi,
Is this terrible SEO wise?
Not really - it just means that Google can see there is a page it can't access, so it is informing you of this. No penalty is going to come from this. If these were old pages that now return 404s, it would be a different story.
I just want to be sure of something - were the pages previously open to Google? Are they currently indexed?
-Andy