What are the negative implications of listing URLs in a sitemap that are then blocked in the robots.txt?
-
In running a crawl of a client's site I can see several URLs listed in the sitemap that are then blocked in the robots.txt file.
Other than perhaps using up crawl budget, are there any other negative implications?
-
I highly doubt it would effect rankings due to low quality issues but it will show that you have site map error warnings in your GWT console. That issue is technically classified as 'Warnings' and not 'Errors'. The right thing to do in that scenario is take the robots.txt block off and just use a 'noindex' tag on the pages. That way they can stay in the site map but they won't show up in the index. Otherwise you should remove them from the sitemap if you don't want the warnings in GWT.
-
I personally do not think there is any penalty SEO wise in doing it. Although, I do think it will mess up the metric in GWT that shows how many pages have been submitted and how many have been indexed. I find that metric useful, so it would make it no longer useful if there are a lot of pages blocked by the robots.txt.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
New URL Structure
Hi Guy's, For our webshop we're considering a new URL structure because longtail keywords to rank so well. Now we have /category (main focus keywords)
Technical SEO | | Happy-SEO
/product/the-product345897345123/ (nice to rank on, not that much volume) We have over 500 categories and every one of them is placed after our domain. Because i think it's better to work with a good structure and managed a way to make categories and sub-categories. The 500 categories may be the case why not every one of them is ranking so well, so that was also the choice of thinking about a new structure. So the new URL structure will be: /category (main focus keywords)
/category/subcat/ (also main focus keywords) Everything will be redirect (301, good way), so i think there won't be to much problems. I'm thinking about what to do with the /product/ URL. Because now it will be on the same level as the subcategories, and i'm affraid that when it's on that level, Google will give the same value to both of them. My options that i'm considering are: **Old way **
/product/the-product-345897345123/ .html (seen this on big webshops)
/product/the-product-345897345123.html/ Level deeper SKU /product/the-product/345897345123/ What would you suggest? The new structure would be 20 categories 500+ sub's devided under main categories 5000+ products Thanks!0 -
Redirect_to in URLs?
I've never seen this before. I'm assuming that it's not SEO friendly and that these should be 301s or 302s instead? http://ksa-beta.motory.com/ar/login/?redirect_to=http://ksa-beta.motory.com/ar/cars-for-sale-search/results/central/riyadh/ford/explorer/2010/ford-explorer-2010-1038353 http://ksa-beta.motory.com/ar/login/?redirect_to=http://ksa-beta.motory.com/ar/account/my-saved-searches/
Technical SEO | | KatherineWatierOng0 -
Is there a maximum sitemap size?
Hi all, Over the last month we've included all images, videos, etc. into our sitemap and now its loading time is rather high. (http://www.troteclaser.com/sitemap.xml) Is there any maximum sitemap size that is recommended from Google?
Technical SEO | | Troteclaser0 -
# in url affecting rank
Hi I am building links to a page www.companyname.com/category.index.php There is also another similar url www.companyname.com/category.index.php#. This page is linked to from the non # page. This is a new client and I'm not entirely sure why that link is there. Am I correct in thinking that these two urls are different in the eyes of the search engines? If so, would some of the link juice to www.companyname.com/category.index.php be transferred to www.companyname.com/category.index.php# and affect the ranking of the non # page? I hope this makes sense! Thanks
Technical SEO | | sicseo0 -
Magento Robots & overly dynamic URL-s
How can i block all URL-s on a Magento store that have 2 or more dynamic parameters in it, since all the parameters have attribute name in it and not some uniform ID Would something like: Disallow: /?&* work? Since the only thing that is constant throughout all the custom parameters is that they are separated with "&" Thanks 🙂
Technical SEO | | tilenkrivec0 -
What does it mean by 'blocked by Meta Robot'? How do I fix this?
When i get my crawl diagnostics, I am getting a blocked by Meta Robot, which means that my page is not being indexed in the search engines... obviously this is a major issue for organic traffic!!! What does it actually mean, and how can i fix it?
Technical SEO | | rolls1230 -
Are URL's with trailing slash seen as two different URLs
Hello, http://www.example.com and http://ww.example.com/ Are these seen as two different URL's ? Just as with www or non www ? Or it doesn't make any difference ?
Technical SEO | | seoug_20050 -
Need Help With Robots.txt on Magento eCommerce Site
Hello, I am having difficulty getting my robots.txt file to be configured properly. I am getting error emails from Google products stating they can't view our products because they are being blocked, and this past week, in my SEO dashboard, the URL's receiving search traffic dropped by almost 40%. Is there anyone that can offer assistance on a good template robots.txt file I can use for a Magento eCommerce website? The one I am currently using was found at this site here: e-commercewebdesign.co.uk/blog/magento-seo/magento-robots-txt-seo.php - However, I am getting problems from Google now because of it. I searched and found this thread here: http://www.magentocommerce.com/wiki/multi-store_set_up/multiple_website_setup_with_different_document_roots#the_root_folder_robots.txt_file - But I felt like maybe I should get some additional help on properly configuring a robots for a Magento site. Thanks in advance for any help. Please, let me know if you need more info to provide assistance.
Technical SEO | | JerDoggMckoy0