Clarification on indexation of XML sitemaps within Webmaster Tools
-
Hi Mozzers,
I have a large service-based website that seems to be losing pages from Google's index. Whilst working on the site, I noticed that there are a number of XML sitemaps for each of the services, so I submitted them to Webmaster Tools last Friday (14th). When I left, they were "pending".
On returning to the office today, they all appear to have been successfully processed on either the 15th or 17th and I can see the following data:
13/08 - Submitted=0 Indexed=0
14/08 - Submitted=606,733 Indexed=122,243
15/08 - Submitted=606,733 Indexed=494,651
16/08 - Submitted=606,733 Indexed=517,527
17/08 - Submitted=606,733 Indexed=517,498
Question 1: The 122,243 indexed pages on the 14th: is this how many pages were already indexed before Google processed the sitemaps, given that they were not marked as processed until the 15th and 17th?
Question 2: The indexed pages are already slipping. I'm working on fixing the site by reducing pages and improving internal structure and content, which I'm hoping will fix the crawling issue. But how often will Google crawl these XML sitemaps?
Thanks in advance for any help.
-
Hi again
This means that because you have multiple sitemaps, Google may crawl them at different times and at different rates, which is why some of your sitemaps took a day longer. I really wouldn't read too much into it; just be assured that Google is crawling your sitemaps and indexing fine.
If you notice major discrepancies between what you submitted and what's being indexed, then I would refer to this Google resource on how to fix issues or errors you find in your sitemap crawl.
Hope this helps! Good luck!
-
Hi there
On the 13th, there were 0 pages indexed. The next day there were 122,243, so in that time period Google indexed 122,243 of your site's pages.
This is a day-by-day process. For each day, subtract the previous day's number from that day's number, and the result is how many pages were freshly indexed.
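As a rough sketch of that subtraction, here is the arithmetic applied to the Indexed figures quoted in the question. The function name `daily_index_changes` and the list layout are just illustrative choices, not anything Webmaster Tools provides.

```python
from typing import List, Tuple


def daily_index_changes(counts: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return (date, change) pairs: each day's indexed count minus the previous day's."""
    return [
        (date, indexed - prev_indexed)
        for (_, prev_indexed), (date, indexed) in zip(counts, counts[1:])
    ]


# Indexed figures as quoted in the question (dd/mm, indexed count).
indexed_by_day = [
    ("13/08", 0),
    ("14/08", 122_243),
    ("15/08", 494_651),
    ("16/08", 517_527),
    ("17/08", 517_498),
]

changes = daily_index_changes(indexed_by_day)
for date, change in changes:
    # A negative change means the indexed count went down that day.
    print(f"{date}: {change:+,}")
```

A negative result, as on 17/08, means the reported indexed count dropped compared with the day before.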
Hope this helps! Good luck!
-
Just checked Webmaster Tools again, and now the sitemaps all say processed on the 17th, and some say the 18th (today). Does this mean the sitemaps are being processed by Google every couple of days?
-
Hi Patrick,
Thanks for elaborating on question 2.
For question 1, I asked if the number (122,243) was how many pages were in the index **before** Google processed the sitemaps, as they don't appear to have been processed until the following day.
You answered yes, but then said it's how many pages were processed that day?
Thanks again for your time and clarification.
-
Hi there
Question 1 - Yes, this is how many pages Google indexed from your sitemap on that day.
Question 2 - XML sitemaps allow you to tell Google the change frequency of your URLs - you can learn more about that here. Also, according to Google:
Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.
Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. We don't accept payment to crawl a site more frequently. For tips on maintaining a crawler-friendly website, please visit our Webmaster Guidelines.
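To illustrate the change frequency mentioned under question 2, here is a minimal sketch that writes a sitemap with a `<changefreq>` value per URL, using Python's standard library. The example.com URLs and the `build_sitemap` helper are made up for illustration; the element names and allowed values come from the sitemaps.org protocol, which describes `changefreq` as a hint rather than a command, so this does not guarantee any particular crawl rate.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Values permitted for <changefreq> by the sitemaps.org protocol.
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_sitemap(entries):
    """Build sitemap XML from (url, changefreq) pairs and return it as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq in entries:
        if changefreq not in ALLOWED_CHANGEFREQ:
            raise ValueError(f"Unsupported changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    ("https://www.example.com/services/", "weekly"),
    ("https://www.example.com/services/plumbing/", "monthly"),
])
print(sitemap_xml)
```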
Please let me know if you have any further questions or comments.
Hope this helps! Good luck!