Google only crawling a small percentage of the sitemap
-
Hi,
The company which I work for have developed a new website for a customer, there URL is https://www.wideformatsolutions.co.uk I've created a sitemap which has 25,555 URL's. I submitted this to Google around 4 weeks ago and the most crawls that have ever occurred has been 2,379.
I've checked everything I can think of, including;
- Speed of website
- Canonical Links
- 404 errors
- Setting a preferred domain
- Duplicate content
- Robots Txt
- .htaccess
- Meta Tags
I did read that Matt Cutts revealed in an interview with Eric Enge that the number of pages Google crawls is roughly proportional to your pagerank. But I'm sure it should crawl more than 2000 pages.
The website is based on Opencart, if anyone has experienced anything like this I would love hear from you.
-
No problem! I meant to mention this in my first comment, but I also noticed that there's no robots.txt file in place. That's obviously not going to help your indexation problem too much, but nonetheless something you should know about.
-
I did have some issues with this when we first launched the site, I will try and look into it further now. The HTTPS certificate is fairly new.
Thanks for commenting
-
Looks to me like Google can't properly access your XML sitemap. I tried to put it into 2 different validator tools and URI Valet and none of those tools were able to access it. It could be something with HTTPS. Did you recently switch the site over to secure?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Not Picking Up Posts
I am trying to work out why from March 4th Google is not seeing my posts. Our google impressions have dropped from 8,000 to 40. If you put in the full article name with speach marks it does not find it, and instead shows the home page in google. We have not had any warnings. We did have work done on our site but nothing else i could think of to cause this. Can anyone let me know what may have caused this. All articles are original
Technical SEO | | headlinesplus0 -
Google Does not find Internal links
Hi guys I involved in difficult situation. in google webmaster tools -> internal links some important pages doesn't have any links from all pages. for example home page just have 9000 internal inks but there are 29000 indexed pages by google and some not important pages have 27000 internal links.(more than home page) Site made by angular v1 Is there anyone can help me why google could not find all internal links?
Technical SEO | | cafegardesh0 -
Website crawl error
Hi all, When I try to crawl a website, I got next error message: "java.lang.IllegalArgumentException: Illegal cookie name" For the moment, I found next explanation: The errors indicate that one of the web servers within the same cookie domain as the server is setting a cookie for your domain with the name "path", as well as another cookie with the name "domain" Does anyone has experience with this problem, knows what it means and knows how to solve it? Thanks in advance! Jens
Technical SEO | | WeAreDigital_BE0 -
Crawl depth and www
I've run a crawl on a popular amphibian based tool, just wanted to confirm... should http://www.homepage be at crawl depth 0 or 1? The audit shows http://homepage at level 0 and http://www.homepage at level 1 through a redirect. Thanks
Technical SEO | | Focus-Online-Management0 -
Website not crawled
i added website www.nsale.in in add campaign, it shows only 1 page crawled. but its working fine for other sites, any idea why it failed ?
Technical SEO | | Dhinesh0 -
Sitemap coming up in Google's index?
I apologize if this question's answer is glaringly obvious, but I was using Google to view all the pages it has indexed of our site--by searching for our company and then clicking the link that says to display more results for the site. On page three, it has the sitemap indexed as if it wee just another page of our site. <cite>www.stadriemblems.com/sitemap.xml</cite> Is this supposed to happen?
Technical SEO | | UnderRugSwept0 -
On-Site Sitemaps - Guidance Required
Hi, I am looking to find good examples of on-site sitemaps. We already submit our XML sitemap regularly through GWMT but I now wonder if we still need an on-site sitemap, as we have about 30 static pages and 300+ Wordpress blogs which in a sense makes that a spammy page as it has too many links and a higher than average keyword density. The reason I am looking for good examples is that I want to create a basic on-site sitemap that aids navigation but is styled to look ok as well. The Solution I have in mind: mydomain.com/link-example-one.php
Technical SEO | | tdsnet
mydomain.com/link-example-two.php
mydomain.com/liink-example-ten.php mydomain.com/blog then links to my 300 WP blogs, broken down into chunks navigated by using breadcrumbs. Will Google crawl this ok or should I stick to the current format listing ALL posts on one page? Thanks0 -
Google Off/On Tags
I came across this article about telling google not to crawl a portion of a webpage, but I never hear anyone in the SEO community talk about them. http://perishablepress.com/press/2009/08/23/tell-google-to-not-index-certain-parts-of-your-page/ Does anyone use these and find them to be effective? If not, how do you suggest noindexing/canonicalizing a portion of a page to avoid duplicate content that shows up on multiple pages?
Technical SEO | | Hakkasan1