Sitemap Contains Blocked Resources
-
Hey Mozzers,
I have several pages on my website that are for user search purposes only. They sort some products by range and answer some direct search queries users type into the site. They are basically just product collections that are grouped elsewhere in different ways.
As such, I didn't want search engines getting their hands on them, so I blocked them in robots.txt so I could add them worry-free. However, they automatically get pulled into the sitemap by Magento.
This has made Webmaster Tools give me a warning that 21 URLs in the sitemap are blocked by robots.txt.
Is this terrible SEO wise?
Should I have opted to NOINDEX these URLs instead? I was concerned about thin content, so I really didn't want Google crawling them.
-
Thanks for the latest responses, guys.
I have researched it into the grave, and the way Magento generates the sitemap makes it impossible for me to exclude these URLs.
I will just unblock them in robots.txt and make them all noindex. This seems to solve all the problems. I will then block them again when I'm 100% sure they are out of the index.
Thanks Again chaps.
Big help as always.
-
OK, so first: because some are indexed, if you block access, they will never be removed.
What you will need to do is add a noindex tag to the pages but don't block access to them so that Google can honour the noindex. Remove the pages via Search Console and once you have confirmed these are all removed from the index, you will be good to then block access via robots.txt.
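The ordering matters because a noindex tag only works if the crawler is allowed to fetch the page and read it. As a rough illustration, here is a minimal Python sketch that checks both conditions for one page. The function names, the sample HTML, and the example.com URL are all hypothetical, and Python's built-in robots.txt parser does plain prefix matching, so it may not agree with Googlebot on wildcard rules.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html):
    """Return True if the page's robots meta tag contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def noindex_is_visible(html, robots_txt, url, user_agent="Googlebot"):
    """Return True only if the page is crawlable AND carries noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url) and has_noindex(html)


# Hypothetical sample data for illustration only.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://example.com/search/red-widgets"
OPEN_ROBOTS = "User-agent: *\nDisallow:\n"
BLOCKING_ROBOTS = "User-agent: *\nDisallow: /search/\n"

print(noindex_is_visible(PAGE, OPEN_ROBOTS, URL))
print(noindex_is_visible(PAGE, BLOCKING_ROBOTS, URL))
```

With the open robots.txt the noindex can be seen and honoured. With the blocking one the same tag is never read, which is why the block should only go back on after the pages have dropped out of the index.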
As CleverPhD said, ideally you don't want pages in the index that can't be crawled, but it isn't likely to cause a penalty of any sort (I have a client with about 70-80 blocked, long story, and no issues in 12 months). If you are stuck because of Magento, perhaps research how others have got around this?
-Andy
-
I would recommend that you try and get those pages out of your sitemap. If you look through the Google sitemap best practices, it states that the sitemap should be for pages that Googlebot can access.
http://googlewebmastercentral.blogspot.com/2014/10/best-practices-for-xml-sitemaps-rssatom.html
URLs
URLs in XML sitemaps and RSS/Atom feeds should adhere to the following guidelines:
- Only include URLs that can be fetched by Googlebot. A common mistake is including URLs disallowed by robots.txt, which cannot be fetched by Googlebot, or including URLs of pages that don't exist.
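If it helps to see which sitemap entries break this guideline, here is a minimal Python sketch that lists sitemap URLs disallowed by a robots.txt file. The function name, the sample sitemap, and the example.com URLs are made up for illustration. It also relies on Python's built-in robots.txt parser, which does simple prefix matching and may not treat wildcard rules the way Googlebot does.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt disallows for user_agent."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in urls if not rules.can_fetch(user_agent, url)]


# Hypothetical sample data for illustration only.
ROBOTS = "User-agent: *\nDisallow: /search/\n"
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/search/red-widgets</loc></url>
  <url><loc>https://example.com/products/widget</loc></url>
</urlset>"""

for url in blocked_sitemap_urls(SITEMAP, ROBOTS):
    print("In sitemap but blocked:", url)
```

In practice you would feed it the real sitemap and robots.txt contents. The result should roughly correspond to the blocked-URL warning in Webmaster Tools, though the two may differ on wildcard rules.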
-
Hi Andy,
I just checked and yes, they were previously indexed and some of them still are.
-
Hi,
Is this terrible SEO wise?
Not really - it just means that Google can see that there is a page they can't access, so they are informing you of this. There is no negative penalty that is going to come from this. If there were old pages that are now 404s, then it would be a different story.
I just want to be sure of something - were the pages previously open to Google? Are they currently indexed?
-Andy