Pages to be indexed in Google

mtthompsons

Hi,

We have 70K posts on our site, but Google has scanned 500K pages; the extra pages are category pages or user profile pages.

Each category has a page and each user has a page. We have 90K users, so Google has indexed 90K user pages alone.

My question is: should we leave them as they are, or should we block them from being indexed? We get unwanted landings on these pages and a huge bounce rate.

If we need to remove them, what needs to be done? A robots.txt block or noindex/nofollow?

Regards

CleverPhD

Thank you Gagan!

Modi

It's a much better and clearer explanation... +1 to it. Cheers!!

CleverPhD

One key point on using robots.txt vs. the meta noindex tag: it is not that the noindex meta tag is "superior"; they just work differently.

If you use robots.txt, it will stop the spider from visiting that page, but it will not remove the page from the index. Also, if you have a page blocked in robots.txt and that page has a 301 redirect, a canonical, or a meta noindex, Google will not see the page (due to the robots.txt directive) and so will not be able to act on the 301, the canonical, or the meta noindex.
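To make the robots.txt side concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/users/` and `/category/` paths and the `example.com` URLs are assumptions for illustration, not the poster's real URL structure. It only shows which URLs a rule-following crawler would skip; anything it skips, it cannot read a noindex, canonical, or redirect from.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the /users/ and /category/ paths are assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /users/
Disallow: /category/
"""


def may_crawl(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if a crawler obeying robots_txt is allowed to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/posts/hello",
                "https://example.com/users/alice"):
        print(url, "crawlable:", may_crawl(url))
```

A blocked URL here is only "not fetched"; nothing in this check says anything about whether the URL is already in the index.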

A meta noindex works differently: because the spider still crawls the page, it sees the tag, and the tag tells Google to remove the page from the index and keep it out. This is key if you want the pages removed from the Google index.
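For reference, the tag in question is a `<meta name="robots" content="noindex">` element in the page's head. Below is a rough sketch, using only Python's standard `html.parser`, of how one might check that a rendered page actually carries it. The sample HTML and the helper names `RobotsMetaParser` and `has_noindex` are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical user profile page carrying the tag.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>User profile</body></html>"
)

if __name__ == "__main__":
    print(has_noindex(SAMPLE_PAGE))
```

Running a check like this against a few user and category pages is a quick way to confirm the tag made it into the template before waiting on a recrawl.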

The rule of thumb I use is:

If you have a page that is not in the Google index and you want to keep it out of the index, put that page in robots.txt.
If you have a page that is in the Google index and you want it removed, use the noindex meta tag; do not put it into robots.txt, for the reasons mentioned above. Over time, once the pages are removed (and this may take a while, depending on how often the page is crawled), you can add them to robots.txt for good measure.
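The rule of thumb above can be written down as a tiny decision helper. This is only a restatement of the two cases in code; the function name `recommended_action` and the returned strings are invented for illustration.

```python
def recommended_action(in_google_index: bool) -> str:
    """Pick a blocking method for a page you do NOT want in Google's index."""
    if in_google_index:
        # Already indexed: the page must stay crawlable so Google can see
        # the noindex tag and drop it. Do not block it in robots.txt yet.
        return "meta noindex"
    # Not indexed yet: stopping the crawl is enough to keep it out.
    return "robots.txt"


if __name__ == "__main__":
    print("indexed user page ->", recommended_action(True))
    print("new, unindexed page ->", recommended_action(False))
```

For the 90K user pages in the question, which are already indexed, this lands on the noindex branch.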

Modi

In order to exclude individual pages from search engine indices, **the noindex meta tag** is actually superior to robots.txt.

Refer - http://moz.com/learn/seo/robotstxt

mtthompsons

Is noindex better, or a robots.txt deny?

What's the difference, or can we do both?

Modi

If they have very little content, do not add any value, and are not searched for by users either, it would be better to add noindex so that search engines crawl your site in a better way.

Gijsbert

If those pages are generating a high bounce rate, I would block them from search engines. The easiest way is probably a robots.txt file.
