Clarification on indexation of XML sitemaps within Webmaster Tools
-
Hi Mozzers,
I have a large service-based website, which seems to be losing pages within Google's index. Whilst working on the site, I noticed that there are a number of XML sitemaps for each of the services. So I submitted them to Webmaster Tools last Friday (14th), and when I left they were "pending".
On returning to the office today, they all appear to have been successfully processed on either the 15th or 17th and I can see the following data:
13/08 - Submitted=0 Indexed=0
14/08 - Submitted=606,733 Indexed=122,243
15/08 - Submitted=606,733 Indexed=494,651
16/08 - Submitted=606,733 Indexed=517,527
17/08 - Submitted=606,733 Indexed=517,498
Question 1: The indexed figure of 122,243 on the 14th: is this how many pages were already indexed before Google processed the sitemaps? They were not marked as processed until the 15th and 17th.
Question 2: The indexed pages are already slipping. I'm working on fixing the site by reducing pages and improving internal structure and content, which I'm hoping will fix the crawling issue. But how often will Google crawl these XML sitemaps?
Thanks in advance for any help.
-
Hi again
Because you have multiple sitemaps, Google will likely crawl them at different times and at different rates, which is why some of your sitemaps took a day longer. I really wouldn't read too much into it; just be assured that Google is crawling your sitemaps and indexing fine.
If you notice major discrepancies between what you submitted and what's being indexed, then I would refer to this Google resource on how to fix issues or errors you find in your sitemap crawl.
Hope this helps! Good luck!
-
Hi there
When you submitted on the 13th, there were 0 pages indexed. The next day there were 122,243, so in that time period, Google indexed 122,243 of your site's pages.
This is a day-by-day process. For each day, subtract the previous day's number from that day's number, and the result is how many pages were freshly indexed.
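The subtraction described above can be sketched in a few lines of Python. This is only an illustration using the "Indexed" figures quoted in the question; the names `indexed_by_day` and `daily_changes` are made up for the example and are not part of any Webmaster Tools export.

```python
# Daily "Indexed" figures copied from the question (dd/mm, count).
indexed_by_day = [
    ("13/08", 0),
    ("14/08", 122_243),
    ("15/08", 494_651),
    ("16/08", 517_527),
    ("17/08", 517_498),
]


def daily_changes(rows):
    """Return (day, change) pairs: each day's count minus the previous day's."""
    return [
        (day, count - prev_count)
        for (_, prev_count), (day, count) in zip(rows, rows[1:])
    ]


changes = daily_changes(indexed_by_day)
for day, change in changes:
    # The "+," format shows the sign and a thousands separator.
    print(f"{day}: {change:+,}")
```

A negative result, as on 17/08 with these figures, means the indexed count fell that day rather than that new pages were indexed.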
Hope this helps! Good luck!
-
Just checked Webmaster Tools again, and now the sitemaps all say processed on the 17th, and some say the 18th (today). Does this mean the sitemaps are being processed by Google every couple of days?
-
Hi Patrick,
Thanks for elaborating on question 2.
For question 1, I asked whether the number (122,243) was how many pages were in the index before Google processed the sitemaps, as they don't appear to have been processed until the following day.
You answered yes, but then said it's how many pages were processed that day?
Thanks again for your time and clarification.
-
Hi there
Question 1 - Yes, this is how many pages Google indexed from your sitemap on that day.
Question 2 - XML sitemaps allow you to tell Google the change frequency of your URLs - you can learn more about that here. Also, according to Google:
Google's spiders regularly crawl the web to rebuild our index. Crawls are based on many factors such as PageRank, links to a page, and crawling constraints such as the number of parameters in a URL. Any number of factors can affect the crawl frequency of individual sites.
Our crawl process is algorithmic; computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. We don't accept payment to crawl a site more frequently. For tips on maintaining a crawler-friendly website, please visit our Webmaster Guidelines.
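To illustrate the change frequency mentioned under Question 2, here is a minimal sketch that writes a sitemap with a `changefreq` value per URL using Python's standard library. The URLs and the `build_sitemap` helper are hypothetical, for illustration only. Note that the sitemaps.org protocol describes `changefreq` as a hint rather than a command, so it does not guarantee any crawl schedule.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Values permitted for <changefreq> by the sitemaps.org protocol.
VALID_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def build_sitemap(entries):
    """Build a <urlset> document from (loc, changefreq) pairs."""
    # Register the sitemap namespace as the default so tags are not prefixed.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, changefreq in entries:
        if changefreq not in VALID_CHANGEFREQ:
            raise ValueError(f"invalid changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
xml_text = build_sitemap([
    ("https://www.example.com/", "daily"),
    ("https://www.example.com/services/example-service", "weekly"),
])
print(xml_text)
```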
Please let me know if you have any further questions or comments.
Hope this helps! Good luck!