Sitemaps and Indexed Pages
-
Hi guys,
I created an XML sitemap and submitted it for my client last month. Since then, the site's developer has also been changing a few things.
I've noticed in my Moz site crawl that the number of indexed pages has dropped significantly.
Before I put my foot in it, I need to figure out whether submitting the sitemap caused this. Can a sitemap reduce the number of pages indexed?
Thanks,
David
-
Thanks Eli!
I was mainly wondering whether the Moz bot only follows pages that are in the sitemap. It was generated by Screaming Frog, so I've trusted it to include all the relevant pages!
I've put a more detailed description in the response below. I still need to investigate further, but I'm satisfied that the sitemap hasn't caused the drop!
-
Thanks Martijn!
I was mainly wondering whether the Moz bot only follows pages that are in the sitemap. It was generated by Screaming Frog, so I've trusted it to include all the relevant pages!
To elaborate:
There were about 80,000 pages, and I used canonicals, noindex, and redirects to clean up a rather large mess of filter URLs and duplicate content.
That brought the page count down to about 14k. Then I submitted the sitemap last month, and now the crawl only finds 4k pages.
Further investigation is needed on my part, but I wanted to double-check that this sudden drop wasn't caused by the sitemap. Thanks for clarifying that!
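As a side note, one quick sanity check in a situation like this is to diff the crawled URL list against what the sitemap actually lists; a sitemap doesn't restrict crawling, so URLs missing from it can still be found and indexed. A minimal sketch using only Python's standard library (the sitemap snippet and URL list here are made-up examples, not the real site):

```python
import xml.etree.ElementTree as ET

# Hypothetical sitemap snippet; in practice, load the file Screaming Frog generated.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}

def missing_from_sitemap(crawled, xml_text):
    """URLs a crawler found that the sitemap does not list."""
    return sorted(set(crawled) - sitemap_urls(xml_text))

# Pretend crawl export: note the filter URL that the sitemap omits.
crawled = [
    "https://example.com/",
    "https://example.com/products",
    "https://example.com/products?color=red",
]
print(missing_from_sitemap(crawled, SITEMAP_XML))
```

Any filter URLs that show up in the output were crawled despite not being in the sitemap, which is one way to see that the sitemap isn't what's limiting the crawl.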
-
Hi David,
Changing, updating, or even deleting a sitemap won't necessarily decrease the number of crawled or ranked pages. A sitemap is mainly used as a signal to discover new pages and to figure out whether old ones have been deleted. The chance that your sitemap had a significant impact on which pages dropped out is something I would find unlikely. If anything, you'd expect the opposite: an increase in pages indexed/submitted/crawled after you submit a sitemap.
Martijn
-
Hey David!
Thanks for reaching out to us!
Unfortunately I'm not an SEO consultant or web developer, so I can't offer specific advice, but I'm sure there are loads of members here who would love to help and have a lot more knowledge than I do! A few things I've picked up that may help:
Try to determine when the drop started. Did it drop when you submitted the XML sitemap, or when the developer changed certain things? That could help point to the reason for the drop in indexing. There are a variety of reasons why Google may choose not to index pages, but some of the common ones are:
-
Check your robots.txt to ensure those pages are still crawlable
-
Check for duplicate content. Were there any canonical changes?
-
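The robots.txt check above can be scripted with Python's built-in `urllib.robotparser`; a small sketch (the rules and URLs are placeholders, and `set_url(...)` plus `read()` would fetch a live file instead):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration.
rules = """User-agent: *
Disallow: /filters/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

pages = [
    "https://example.com/products",
    "https://example.com/filters/color-red",
]
for url in pages:
    # can_fetch() reports whether the named crawler is allowed to request the URL.
    print(url, parser.can_fetch("Googlebot", url))
```

Running this over a crawl export is a fast way to spot whole sections of a site that a developer change may have accidentally disallowed.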
One of the tools you could use to keep track of ranking fluctuations is MozCast (http://mozcast.com/). Was there turbulence in the Google algorithm when the indexed pages dropped significantly?
If you'd like us to look at your specific campaign and investigate further, please pop an email over to help@moz.com.
Thanks!
Eli
-