Pages not indexed by Google

Probikeshop

We recently deleted all the nofollow values on our website. (2 weeks ago)

The number of pages indexed by google is the same as before?

Do you have explanations for this?

website : www.probikeshop.fr

loopyal

Good advice from Andrea and Brent.

To use multiple sitemaps, do something like this:

The main sitemap points to the other sitemap files.

You can have up to 50,000 URLs in those files.

mine are gzipped

This one is sitemap_index.xml

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"><sitemap><loc>http://yourdomain.com/writermap.xml.gz</loc>
<lastmod>2012-03-15</lastmod></sitemap>
<sitemap><loc>http://yourdomain.com/mainmap.xml.gz</loc>
<lastmod>2012-03-15</lastmod></sitemap>
<sitemap><loc>http://yourdomain.com/201201.xml.gz</loc>
<lastmod>2012-03-15</lastmod></sitemap></sitemapindex>

<sitemap><loc>http://yourdomain.com/201202.xml.gz</loc>
<lastmod>2012-03-15</lastmod></sitemap>

Here is a tip:

Google will index some of those pages and some it will not index.

If you have 5,000 urls in one sitemap and they only index 4957

you probably can't work out which 43 URLs they didn't index,

so if you make the numbers smaller, it can be easier to discover the pages they don't like.

not easy, but easier

josh-riley

Well, there's a lot of ways to look at this - this wouldn't result in more pages indexed, so the two issues are totally separate.

If the goal is to get more pages indexed, then a site map (either XML or event a text list) uploaded to your server for Google to find can help. Or, at least that makes sure that Google is finding and indexing the pages you want them to find. Your Google Webmaster Tools account (assuming you have one) will also tell you some data.

For example, we used to have 100K+ pages; many weren't quality content I wanted to rank. Like, a PDF of a catalog ranking about the product page. So, I reduced the number of pages indexed so Google would have better, more quality content to serve to searchers.

Using Xenu or Screaming Frog is another good way to help uncover pages. Those tools crawl your site like Google would,then you can download the file and not only see all the URLs found, but also if they are 301/404/200, etc. And, Screaming Frog can crawl your site and output a XML sitemap for you (it's an easier way to make one).

I prefer SF and it's about $150 US dollars for the use - well worth it.

As for why - well, if you have a lot of pages, Google doesn't always find them. That's where a site map can help (it directs Google what to crawl). Otherwise, there could be technical issues to a bunch of pages and they aren't properly linked up or something and that could be causing the issue.

Probikeshop

So according to you, it's normal if we dont have more pages indexed by Google, since we have deleted the nofollow values?

Google actually index 28,200 pages, but i'm sure we have more pages on site.

From where, could come the problem?

Thanks

Copstead

Do you have XML sitemaps? If not this is a great way to measure what is being indexed by Google. Make sure you create multiple sitemaps based on your categories so you can track exactly which pages are not being indexed.

josh-riley

'No follow' isn't the same as a 'no index' code. No follow just tells the search engine it "should not influence the link target's ranking in the search engine's index." 'No index' is where you tell the crawler to not index the pages, then you can remove that if you at some future point want them indexed.

So, in theory, what you did wouldn't have anything to do with how many pages are indexed on your site anyway.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Pages not indexed by Google

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Home Page Being Indexed / Referral URLs /

How long does Google takes to re-index title tags?

Can you noindex a page, but still index an image on that page?

Should We Index These Category Pages?

Huge number of indexed pages with no content

I have 15,000 pages. How do I have the Google bot crawl all the pages?

Removing a site from Google's index

Is this 404 page indexed?