Google only crawling a small percentage of the sitemap

chrissmithps

Hi,

The company which I work for have developed a new website for a customer, there URL is https://www.wideformatsolutions.co.uk I've created a sitemap which has 25,555 URL's. I submitted this to Google around 4 weeks ago and the most crawls that have ever occurred has been 2,379.

I've checked everything I can think of, including;

Speed of website
Canonical Links
404 errors
Setting a preferred domain
Duplicate content
Robots Txt
.htaccess
Meta Tags

I did read that Matt Cutts revealed in an interview with Eric Enge that the number of pages Google crawls is roughly proportional to your pagerank. But I'm sure it should crawl more than 2000 pages.

The website is based on Opencart, if anyone has experienced anything like this I would love hear from you.

LoganRay

No problem! I meant to mention this in my first comment, but I also noticed that there's no robots.txt file in place. That's obviously not going to help your indexation problem too much, but nonetheless something you should know about.

chrissmithps

I did have some issues with this when we first launched the site, I will try and look into it further now. The HTTPS certificate is fairly new.

Thanks for commenting

LoganRay

Looks to me like Google can't properly access your XML sitemap. I tried to put it into 2 different validator tools and URI Valet and none of those tools were able to access it. It could be something with HTTPS. Did you recently switch the site over to secure?

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

Google only crawling a small percentage of the sitemap

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Why Google crawl parameter URLs?

Image Sitemap

Google Not Indexing Submitted Images

Has Google Stopped Listing URLs with Crawl Errors in Webmaster Tools?

Host sitemaps on S3?

Is google all over the place tonight?

Magento - Google Webmaster Crawl Errors

Crawling image folders / crawl allowance