Only half of the sitemap is indexed
-
I have a website with high domain authority and high quality content and blog. I've resubmitted the sitemap half a dozen times. Search console getr half way through and then stops. Does anyone know any reason for this?
I've seen the usual responses of 'google is not obligated to crawl you' but this site has been fully crawled in the past. It's very odd
Does anyone have any ideas why it might stop half way - or does anyone know a testing tool that might illuminate the situation?
-
Hi Andrew
Here a few things to check or rule out:
-
Are those pages accessible to be crawled (not blocked with robots.txt etc)
-
Are they also internally linked? (ie;s crawl with Screaming Frog, starting at the homepage and see if they turn up)
-
Is the page actually indexed (search the URL in Google) but just not showing up in Search Console?
-
How long are you waiting before resubmitting - also does it literally get half way down the list, or do you mean 50% are not indexed?
Overall, I would just submit the sitemap and you don't need to keep resubmitting. I would rather do some crosschecks to make sure the URL is accessible (crawlable) and even maybe indexed already, just not showing in the report. Usually, there's some other issue with the URL besides a sitemap issue - and like you mentioned, I'm not sure how long you're waiting, but it can indeed take weeks for them to show up.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Website have Caching/Indexing / Ranking Issue
Hi, My Website (https://www.v3cars.com) is not cached or indexed on regular basic from last 15 days. before this it was cached or indexed on regular basic. We are uploading fresh content on daily basic. Currently my new content is not ranked anywhere in Google even after cached or indexed. Please help and suggest. Sandeep - Love to Cars
Algorithm Updates | | onlinesandeep0 -
Indexed, though blocked by robots.txt: Need to bother?
Hi, We have intentionally blocked some of the website files which were indexed for years. Now we receive a message "Indexed, though blocked by robots.txt" in GSC. We can ignore as per my knowledge? Are any actions required about this? We thought of blocking them with meta tags but these are PDF files. Thanks
Algorithm Updates | | vtmoz1 -
Meta robots at every page rather than using robots.txt for blocking crawlers? How they'll get indexed if we block crawlers?
Hi all, The suggestion to use meta robots tag rather than robots.txt file is to make sure the pages do not get indexed if their hyperlinks are available anywhere on the internet. I don't understand how the pages will be indexed if the entire site is blocked? Even though there are page links are available, will Google really index those pages? One of our site got blocked from robots file but internal links are available on internet for years which are not been indexed. So technically robots.txt file is quite enough right? Please clarify and guide me if I'm wrong. Thanks
Algorithm Updates | | vtmoz0 -
Is it stil a rule that Google will only index pages up to three tiers deep? Or has this changed?
I haven't looked into this in a while, it used to be that you didn't want to bury pages beyond three clicks from the main page. What is the rule now in order to have deep pages indexed?
Algorithm Updates | | seoessentials0 -
A few sitemap questions
1. When I do a sitemap through a generator, it lists some of my URLs twice, with and without the last slash. Ex: <url><loc>http://www.howlatthemoon.com/locations/location-hollywood</loc><lastmod>2013-11-25T16:12:50+00:00</lastmod><changefreq>daily</changefreq><priority>0.9</priority></url> <url><loc>http://www.howlatthemoon.com/locations/location-hollywood/</loc><lastmod>2013-11-25T16:14:27+00:00</lastmod><changefreq>daily</changefreq><priority>0.69</priority></url> Should I remove one of these or leave it? 2. What is the importance of lastmod? I've read that if you have a lastmod listed, Google won't recrawl until a new time/date is up? 3. This goes along with lastmod, but is changefreq important? Can it hurt me at all?
Algorithm Updates | | howlusa0 -
When Google crawls and indexes a new page does it show up immediately in Google search - "site;"?
We made changes to a site, including the addition of a new page and corresponding link/text changes to existing pages. The changes are not yet showing up in the Google index (“site:”/cache), but, approximately 24 hours after making the changes, The SERP's for this site jumped up. We obtained a new back link about a couple of weeks ago, but it is not yet showing up in OSE, Webmaster Tools, or other tools. Just wondering if you think the Google SERP changes run ahead of what they actually show us in site: or cache updates. Has Google made a significant SERP “adjustment” recently? Thanks.
Algorithm Updates | | richpalpine0 -
Is URL appearance defined by crawling or by XML sitemap
I am having a problem developing a sitemap because I have long URLs that are made by zend. They go like this: http://myagingfolks.com/professionals/20661/social-workers/pennsylvania-civi-stanger Because these URL's are long and are fed by Zend when I try to call them all up, to put on the sitemap, the system runs out of memory and crashes. Do you know what part of a search result, in google, say, comes from the URL? Would it be fine for me to submit to google only www.myagingfolks.com/professionals/20661. Does the crawler find that the URL is indeed http://myagingfolks.com/professionals/20661/social-workers/pennsylvania-civi-stanger or does it go with just what the sitemap tells it?
Algorithm Updates | | Jordanrg0 -
Tuesday July 12th = We suddenly lost all our top Google rankings. Traffic cut in half. Ideas?
The attached screenshot shows all. Panda update hit us hard = we lost half our traffic. Three months later, Panda tweak gave us traffic back. Now, this past Tuesday we lost half our traffic again and ALL our top ranking Keywords/phrases on Google (all other search engines keywords holding rank fine). Did they tweak their algorithm again? What are we doing wrong?? eartheasy.com wtf.jpg
Algorithm Updates | | aran0880