Include or exclude noindex urls in sitemap?
-
We just added noindex tags to our pages with thin content.
Should we include those URLs in our sitemap.xml file or exclude them? I've read conflicting recommendations.
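For anyone wanting to double-check, a small script can confirm which of those URLs actually serve the noindex, either in a robots meta tag or in an X-Robots-Tag header. This is only a rough sketch; it assumes the requests and beautifulsoup4 packages, and the "urls.txt" file name is made up.

```python
# Rough sketch: report which URLs actually serve a noindex directive,
# either via the X-Robots-Tag HTTP header or a robots meta tag.
# Assumes `requests` and `beautifulsoup4` are installed; "urls.txt" is a made-up file name.
import requests
from bs4 import BeautifulSoup

def has_noindex(url):
    resp = requests.get(url, timeout=10)
    # Check the HTTP header first
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        return True
    # Then check the robots meta tag in the HTML
    soup = BeautifulSoup(resp.text, "html.parser")
    meta = soup.find("meta", attrs={"name": "robots"})
    return bool(meta and "noindex" in meta.get("content", "").lower())

if __name__ == "__main__":
    with open("urls.txt") as f:
        for url in (line.strip() for line in f if line.strip()):
            print(url, "noindex" if has_noindex(url) else "indexable")
```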
-
Hi vcj and the rest of you guys
I would be very interested in learning what strategy you actually went ahead with, and the results. I have a similar issue as a result of pruning, and removing noindex pages from the sitemap makes perfect sense to me. We set noindex, follow on several thousand pages with thin content and no product descriptions, and we've set things up so that when we add new descriptions and updated on-page elements, the noindex is automatically reversed. That sounds perfect; however, hardly any of those pages (3000-4000) have been indexed to date, so I'm looking for a feasible solution for exactly the same reasons as you.
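(For what it's worth, the reversal rule itself can be expressed as a simple template condition. The sketch below is only an illustration; the field name and the length threshold are made up.)

```python
# Rough illustration of the kind of rule described above: pages without a
# usable description get noindex,follow; once a description is added, the
# directive flips back automatically. The field name and threshold are made up.

MIN_DESCRIPTION_LENGTH = 200  # arbitrary cutoff for "thin"

def robots_meta(product):
    description = (product.get("description") or "").strip()
    if len(description) < MIN_DESCRIPTION_LENGTH:
        return '<meta name="robots" content="noindex, follow">'
    return '<meta name="robots" content="index, follow">'

# Example:
print(robots_meta({"description": ""}))                                # noindex, follow
print(robots_meta({"description": "A full product write-up " * 20}))   # index, follow
```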
We have better or comparable metrics and optimization relative to a lot of the competition, yet our rankings are mediocre, so we're looking to improve on this.
It would be good to hear your views
Cheers
-
I'm aware that Google will get to them sooner or later.
The recommendation from Gary Illyes (of Google), as mentioned in this post, was the reason I asked the question. I'm not trying to outsmart Google, just trying to work within their guidelines in the most efficient way possible.
-
Just to put things into perspective,
If these URLs are already indexed and you have used "noindex" on those pages, sooner or later Google will re-crawl them and they will be removed. You may want them out of the index ASAP for some reason, but adding them to the sitemap won't really change anything, because Google will not deindex your noindex pages just because they are in your sitemap.xml.
Google deindexes a page only when it is time to re-crawl it. Google never recommends putting noindex URLs in sitemaps, and you won't find that suggestion in their guidelines on blocking search indexing. Google also states the following:
"Google will completely drop the page from search results, even if other pages link to it. If the content is currently in our index, we will remove it after the next time we crawl it. (To expedite removal, use the Remove URLs tool in Google Webmaster Tools.)"
But hey, every SEO has their own take. Some try to outsmart Google, some don't.
Good luck
-
That opens up other potential obstacles to getting this done quickly and easily. I wouldn't consider it best practice to create what is essentially a spam page full of internal links, and Googlebot likely won't crawl all 4,000 links if you put them all there. So now you'd be talking about making maybe 20 or so thin, spammy-looking pages of 200+ internal links each to hopefully fix the issue.
The quick, easy-sounding options are often not the best options. Considering you're doing all of this to fix issues that arose from an algorithmic penalty, I'd suggest following best practices when making these changes. It might not be easy, but it will lessen the chance that a quick fix becomes the cause of, or a contributor to, a future penalty.
So if Fetch As won't work for you (given the lack of manpower to manually fetch 4,000 pages), the sitemap.xml option might be the better choice.
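If you do go the sitemap route, one tidy way to handle it is a separate, throwaway sitemap containing only the noindexed URLs, so it can be submitted now and simply deleted once those pages have dropped out of the index. A rough sketch, with the file names made up:

```python
# Rough sketch: write the noindexed URLs into a separate, temporary sitemap
# that can be submitted now and removed once the pages have been deindexed.
# File names ("noindexed-urls.txt", "sitemap-removals.xml") are made up.
from datetime import date
from xml.sax.saxutils import escape

def build_removals_sitemap(urls):
    today = date.today().isoformat()
    entries = "\n".join(
        f"  <url>\n    <loc>{escape(u)}</loc>\n    <lastmod>{today}</lastmod>\n  </url>"
        for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )

if __name__ == "__main__":
    with open("noindexed-urls.txt") as f:
        urls = [line.strip() for line in f if line.strip()]
    with open("sitemap-removals.xml", "w") as out:
        out.write(build_removals_sitemap(urls))
```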
-
Thanks, Mike.
What are your thoughts on creating a page with links to all of the pages we've Noindexed, doing a Fetch As and submitting that URL and its linked pages? Do you think Google would dislike that?
-
You could technically add them to the sitemap.xml in the hope that this gets them noticed faster, but the sitemap is generally meant for the things you want Google to crawl and index. Placing them in the sitemap also doesn't guarantee Google will get around to crawling your change or those specific pages any sooner. Technically speaking, doing nothing and just waiting is equally valid: Google will recrawl your site at some point, and sitemap.xml only helps if Google is already crawling you and sees it.
Fetch As, on the other hand, makes Google fetch your page as it is now, which is like forcing part of a crawl. So technically Fetch As will be the more reliable, quicker choice, though it is more labor-intensive. If you don't have the man-hours for a project like that at the moment, then waiting or using the sitemap could work for you. Google even suggests using Fetch As for URLs you want them to see that you have blocked with meta tags: https://support.google.com/webmasters/answer/93710?hl=en&ref_topic=4598466
-
There are too many pages to do that (unless we created a page with links to all of the Noindexed pages, then asked Google to crawl that page and all linked pages, though that seems like it might be a bad approach). It's an ecommerce website, and we Noindexed nearly 4,000 pages that had thin or duplicate content (manufacturer descriptions, no description on a brand page, etc.) and no organic traffic in the past 90 days.
This site was hit by Panda in September 2014 and isn't ranking for things it should be – it has pages with better backlink profiles, higher DA/PA, better content, etc. than our competitors. Our thought is that we're not ranking because of a penalty against thin/duplicate content. So we decided to Noindex these pages, improve the content on products that are selling and getting traffic, then work on improving the pages we've Noindexed before switching them back to Index.
We're basically following the recommendations from this article: https://moz.com/blog/pruning-your-ecommerce-site
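For anyone attempting similar pruning, the selection step can be scripted against an analytics export. This is only a rough sketch; the CSV column names and the thin-content threshold are made up.

```python
# Rough sketch of the selection step described above: flag pages with thin
# content and no organic traffic in the last 90 days as noindex candidates.
# Assumes a CSV export with made-up column names: url, organic_sessions_90d, description_words.
import csv

MIN_DESCRIPTION_WORDS = 100  # arbitrary "thin content" threshold

def noindex_candidates(path):
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            sessions = int(row["organic_sessions_90d"] or 0)
            words = int(row["description_words"] or 0)
            if sessions == 0 and words < MIN_DESCRIPTION_WORDS:
                yield row["url"]

if __name__ == "__main__":
    for url in noindex_candidates("pages.csv"):
        print(url)
```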
-
If the pages are in the index and you've recently added a NoIndex tag with the express purpose of getting them removed from the index, you may be better served submitting crawl requests in Search Console for the pages in question.
-
Thanks for your response!
I did some more digging. This seems to contradict your suggestion:
https://twitter.com/methode/status/653980524264878080
If the goal is to have these pages removed from the index, and having them in the sitemap means they'll be picked up sooner by Google's crawler, then it seems to make sense that they should be included until they're removed from the index.
Am I misinterpreting this?
-
Hi
The reason you submit a sitemap to a search engine is to ease and aid the crawling process for the pages that you want indexed. It speeds up crawling and lets the search engine discover pages that have no internal links pointing to them, etc.
A "noindex" tag does the opposite.
So no, you should not include noindex pages in your sitemap files.
In general, you should also avoid including pages that don't return a 200 status code. Good luck
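P.S. If it helps, both of those checks can be rolled into the sitemap build, so only URLs that return a 200 and don't carry a noindex make it in. A rough sketch, assuming the requests and beautifulsoup4 packages, with made-up file names:

```python
# Rough sketch of the advice above: a URL earns a place in the sitemap only if
# it returns 200 and carries no noindex directive (header or meta tag).
# Assumes `requests` and `beautifulsoup4`; "all-urls.txt" is a made-up file name.
import requests
from bs4 import BeautifulSoup

def is_sitemap_worthy(url):
    resp = requests.get(url, timeout=10)
    if resp.status_code != 200:
        return False
    if "noindex" in resp.headers.get("X-Robots-Tag", "").lower():
        return False
    meta = BeautifulSoup(resp.text, "html.parser").find("meta", attrs={"name": "robots"})
    return not (meta and "noindex" in meta.get("content", "").lower())

if __name__ == "__main__":
    with open("all-urls.txt") as f:
        keep = [u.strip() for u in f if u.strip() and is_sitemap_worthy(u.strip())]
    print(f"{len(keep)} URLs eligible for the sitemap")
```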