Clarification on indexation of XML sitemaps within Webmaster Tools
-
Hi Mozzers,
I have a large service based website, which seems to be losing pages within Google's index. Whilst working on the site, I noticed that there are a number of xml sitemaps for each of the services. So I submitted them to webmaster tools last Friday (14th) and when I left they were "pending".
On returning to the office today, they all appear to have been successfully processed on either the 15th or 17th and I can see the following data:
13/08 - Submitted=0 Indexed=0
14/08 - Submitted=606,733 Indexed=122,243
15/08 - Submitted=606,733 Indexed=494,651
16/08 - Submitted=606,733 Indexed=517,527
17/08 - Submitted=606,733 Indexed=517,498Question 1: The indexed pages on 14th of 122,243 - Is this how many pages were previously indexed? Before Google processed the sitemaps? As they were not marked processed until 15th and 17th?
Question 2: The indexed pages are already slipping, I'm working on fixing the site by reducing pages and improving internal structure and content, which I'm hoping will fix the crawling issue. But how often will Google crawl these XML sitemaps?
Thanks in advance for any help.
-
Hi again
This means that because you have multiple sitemaps, Google is going to crawl those at different times possibly and at different rates, hence some of your sitemaps taking a day longer. I really wouldn't look into it too much, and just be assured that Google is crawling your sitemaps fine and indexing.
If you notice major discrepancies in what you submitted and what's being indexed, then I would refer to this Google resource on how to fix issues or errors you find in your sitemap crawl.
Hope this helps! Good luck!
-
Hi there
You submitted on the 13th, there were 0 pages indexed. The next day there were 122,243, so in that time period, Google indexed 122,243 of your site's pages.
This is a day by day process. So whatever new number appears on each day, subtract the previous day's number from your present day number, and that's how many pages were freshly indexed.
Hope this helps! Good luck!
-
Just checked webmaster tools again, and now they (sitemaps) all say processed on 17th and some say 18th (today) does this mean the sitemaps are being processed by Google every couple of days?
-
Hi Patrick,
Thanks for elaborating on question 2.
Question 1, I asked if the number (122,243) was how many pages were in the index** before** google processed the sitemaps, as they don't appear to have been processed until the following day.
You answered, yes but then said its how many pages were processed that day?
Thanks again for your time and clarification.
-
Hi there
Question 1 - Yes, this is how many pages Google indexed from your sitemap on that day.
Question 2 - XML sitemaps allow you to tell Google the change frequency of your URLs - you can learn more about that here. Also, according to Google:
Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.
Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. We don't accept payment to crawl a site more frequently. For tips on maintaining a crawler-friendly website, please visit our Webmaster Guidelines.
Please let me know if you have any further questions or comments.
Hope this helps! Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Index problems
“The website http://www.vaneyckshutters.com/nl/ does not show in the index of Google (site:vaneyckshutters.com/nl/). This must be the homepage in the Netherlands. Previously, the page www.vaneyckshutters.com was redirected to /nl/. This page is accessible now with a canonical tag to http://www.vaneyckshutters.com/nl/ in the hope to let /nl/ be indexed. When we look at the SERPS for keyword ‘shutters’, the page http://www.vaneyckshutters.com/ is shown in Google.nl on #32 and in Belgium #3. Problem & question: Why is it that /nl/ has not been indexed properly and why is it that we rank with http://www.vaneyckshutters.com on ‘shutters’ instead the/nl/ page?”
Technical SEO | | Happy-SEO1 -
How to change noindex to index?
Hey, I've recently upgraded to a pro SEOmoz account and have realised i have 14574 issues to do with 'blocked by meta-robot' and that 'This page is being kept out of the search engine indexes by the meta tag , which may have a value of "noindex", keeping this page out of the index.' How can i change this so my pages get indexed? I read somewhere that i need to change my privacy settings but that thread was 3 years old and now the WP Dashboard has updated.. Please let me know Many thanks, Jamie P.s Im using WordPress 3.5 And i have the XML sitemap plugin And i have no idea where to look for this robots.txt file..
Technical SEO | | markgreggs0 -
Can you have a /sitemap.xml and /sitemap.html on the same site?
Thanks in advance for any responses; we really appreciate the expertise of the SEOmoz community! My question: Since the file extensions are different, can a site have both a /sitemap.xml and /sitemap.html both siting at the root domain? For example, we've already put the html sitemap in place here: https://www.pioneermilitaryloans.com/sitemap Now, we're considering adding an XML sitemap. I know standard practice is to load it at the root (www.example.com/sitemap.xml), but am wondering if this will cause conflicts. I've been unable to find this topic addressed anywhere, or any real-life examples of sites currently doing this. What do you think?
Technical SEO | | PioneerServices0 -
Do Seomozers recommend sitemaps.xml or not. I'm thoroughly confused now. The more I read, the more conflicted I get
I realize I'm probably opening a can of worms, but here we go. Do you or do you not add a sitemap.xml to a clients site?
Technical SEO | | catherine-2793880 -
Disappeared from Google with in 2 hours of webmaster tools error
Hey Guys I'm trying not to panic but....we had a problem with google indexing some of our secure pages then hit those pages and browsers firing up security warning, so I asked our web dev to have at look at it he made the below changes and within 2 hours the site has drop off the face of google “in web master tools I asked it to remove any https://freestylextreme.com URLs” “I cancelled that before it was processed” “I then setup the robots.txt to respond with a disallow all if the request was for an https URL” “I've now removed robots.txt completely” “and resubmitted the main site from web master tools” I've read a couple of blog posts and all say to remain clam , test the fetch bot on webmasters tools which is all good and just wait for google to reindex do you guys have any further advice ? Ben
Technical SEO | | elbeno1 -
Getting Posts Indexed
On a Wordpress site I'm working on you can get to any product from home in 2 clicks but I'm a llittle concerned about the URL which looks like this: domain/categoryname/subcategoryname/productpage Will I have trouble getting my products indexed?
Technical SEO | | waynekolenchuk0 -
Google Webmaster Tools: Keywords
Hi SEOmozzers! I'm the Dr./owner/in-house SEO for my eye care practice. The URL is www.ofallonfamilyeyecare.com. Our practice is in O'Fallon, MO. Since I'm an optometrist, my main keywords are "optometrist o'fallon" and "o'fallon optometrist". As I get more familiarity with SEO, Google Analytics and Webmaster Tools, I've discovered the Keywords that Google feels best represent my website. About a week ago I noted Google counted 21 instances of "optometrist" on the 28-30 pages of my website, which ranks as #32 in the most common keywords. #1 is "eye" with 506 instances. Even though 21 occurrences seemed low, I went though every page adding "optometrist" a couple times in the body where it would naturally be appropriate. I also added it to the address shown on the footer of every page. I changed the top navigation option of "meet Dr. Hegyi" to "our optometrist". I must have added at least 4 occurrences to every page on my site, and submitted for a re-crawl. I even tried to scale back the "eye" occurrences on a few pages. Today I see that Google has re-crawled the site and the keywords have been updated. "Optometrist has DROPPED from #32 to #33. Does anyone have any ideas or suggestions why I'm not seeing increased occurrence in Googles eyes? I realize this may not be a big factor in SERPs, but every bit of on-page optimization helps. Or is this too minor of an issue to sweat? Thanks!
Technical SEO | | JosephHegyi0 -
What tool do you use to check for URLs not indexed?
What is your favorite tool for getting a report of URLs that are not cached/indexed in Google & Bing for an entire site? Basically I want a list of URLs not cached in Google and a seperate list for Bing. Thanks, Mark
Technical SEO | | elephantseo3