Google only crawling a small percentage of the sitemap
-
Hi,
The company which I work for have developed a new website for a customer, there URL is https://www.wideformatsolutions.co.uk I've created a sitemap which has 25,555 URL's. I submitted this to Google around 4 weeks ago and the most crawls that have ever occurred has been 2,379.
I've checked everything I can think of, including;
- Speed of website
- Canonical Links
- 404 errors
- Setting a preferred domain
- Duplicate content
- Robots Txt
- .htaccess
- Meta Tags
I did read that Matt Cutts revealed in an interview with Eric Enge that the number of pages Google crawls is roughly proportional to your pagerank. But I'm sure it should crawl more than 2000 pages.
The website is based on Opencart, if anyone has experienced anything like this I would love hear from you.
-
No problem! I meant to mention this in my first comment, but I also noticed that there's no robots.txt file in place. That's obviously not going to help your indexation problem too much, but nonetheless something you should know about.
-
I did have some issues with this when we first launched the site, I will try and look into it further now. The HTTPS certificate is fairly new.
Thanks for commenting
-
Looks to me like Google can't properly access your XML sitemap. I tried to put it into 2 different validator tools and URI Valet and none of those tools were able to access it. It could be something with HTTPS. Did you recently switch the site over to secure?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Not Indexing Submitted Images
Hi Guys! My question isn't too dissimilar to one asked a couple of years ago, regarding Google and image indexing, but having put my web address into a Google image search, I get a return of 15 images, so something isn't right. 5 months ago I submitted our 'new' site to Google webmaster. We have just moved it onto a Shopify platform. They (Shopify) are good at providing places to add titles and Alt tags and likewise we fill them in (so that box ticked!) However I have noticed over the last couple of months that despite 161 images being submitted, only 51 have been indexed. Furthermore and as I said earlier, when you put our site, site:http://www.hartnackandco.com into Google images, it only returns a total of 15 images. Any suggestions and help would be wonderful! Cheers Nick
Technical SEO | | nick_HandCo0 -
Sitemap international websites
Hey Mozzers,Here is the case that I would appreciate your reply for: I will build a sitemap for .com domain which has multiple domains for other countries (like Italy, Germany etc.). The question is can I put the hreflang annotations in sitemap1 only and have a sitemap 2 with all URLs for EN/default version of the website .COM. Then put 2 sitemaps in a sitemap index. The issue is that there are pages that go away quickly (like in 1-2 days), they are localised, but I prefer not to give annotations for them, I want to keep clear lang annotations in sitemap 1. In this way, I will replace only sitemap 2 and keep sitemap 1 intact. Would it work? Or I better put everything in one sitemap?The second question is whether you recommend to do the same exercise for all subdomains and other domains? I have read much on the topic, but not sure whether it worth the effort.The third question is if I have www.example.it and it.example.com, should I include both in my sitemap with hreflang annotations (the sitemap on www.example.com) and put there it for subdomain and it-it for the .it domain (to specify lang and lang + country).Thanks a lot for your time and have a great day,Ani
Technical SEO | | SBTech0 -
Absurdly High Crawl Stats
Over the past month and a half, our crawl stats have been rising violently. A few weeks ago, our crawl stats rose, such that the pages crawled per day worked out to the entire site being crawled 6 times a day, with a corresponding rise in KB downloaded per day. Last week, the crawl rate jumped again, such that the site is being crawled roughly 30x a day. I'm not seeing any chatter at there about an algorithm change, and I've checked and double-checked the site for signs of duplicate content, changes in our backlink profile, or anything else. We haven't seen appreciable changes in our search volume, either impressions or clicks. Any ideas what could be going on?
Technical SEO | | Tyler-Brown0 -
XML Sitemap Creation
I am looking for a tool where I can add a list of URL's and output an XML sitemap. Ideally this would be Web based or work on the mac? Extra bonus if it handles video sitemaps. My alternative is XLS and a bunch of concatenates, but I'd rather something cleaner. It doesn't need to crawl the site. Thanks.
Technical SEO | | Jeff_Lucas0 -
Benefits to having an HTML sitemap?
We are currently migrating our site to a new CMS and in part of this migration I'm getting push-back from my development team regarding the HTML sitemap. We have a very large news site with 10s of thousands of pages. We currently have an HTML sitemap that greatly helps with distributing PR to article pages, but is not geared towards the user. The dev team doesn't see the benefit to recreating the HTML sitemap despite my assurance that we don't want to lose all these internal links since removing 1000s of links could have a negative impact on our Domain Authority. Should I give in and concede the HTML sitemap since we have an XML one? Or am I right that we don't want to get rid of it?
Technical SEO | | BostonWright0 -
Google previews meanings
I have noticed that when I do a site:mysite.com search and all my indexed pages come up, that when I hover over the arrows and bring up the previews some of the pages have things in a red rectangle and some of the pages even have zig zag-like tears/perforations running through them. What do these signify?
Technical SEO | | VictorVC0 -
Will Google Continue to Index the Page with NoIndex Tag Upon Google +1 Button Impression or Click?
The FAQs for Google +1 button suggests as follows: "+1 is a public action, so you should add the button only to public, crawlable pages on your site. Once you add the button, Google may crawl or recrawl the page, and store the page title and other content, in response to a +1 button impression or click." If my page has NoIndex tag, while at the same time inserted with Google +1 button on the page, will Google recognise the NoIndex Tag on the page (and will not index the page) despite the +1 button's impression or clicks send signals to Google spiders?
Technical SEO | | globalsources.com0 -
Crawl Errors
Okay, I was just in my Google Webmaster Tools and was looking at some of the stats. I have 1354 "not found" pages google says. Many of these URL's are bizarre. I don't know what they are. Others I do know. What should I do about this? Especially all the URL's I don't even know what they are?
Technical SEO | | azguy0