Sitemaps and Indexed Pages
-
Hi guys,
I created an XML sitemap and submitted it for my client last month.
Now the developer of the site has also been messing around with a few things.
I've noticed on my Moz site crawl that indexed pages have dropped significantly.
Before I put my foot in it, I need to figure out whether submitting the sitemap caused this. Can a sitemap reduce the number of pages indexed?
Thanks
David.
-
Thanks Eli!
I guess I was wondering whether the Moz bot only follows pages that are in the sitemap. The sitemap was generated by Screaming Frog, and I have trusted it to include all relevant pages!
I have put a more detailed description in the response below. Overall I need to investigate further, but I'm satisfied that the sitemap has not caused the drop!
-
Thanks Martijn!
I guess I was wondering whether the Moz bot only follows pages that are in the sitemap. The sitemap was generated by Screaming Frog, and I have trusted it to include all relevant pages!
To elaborate: there were about 80,000 pages, and I used canonical tags, noindex, and redirects to clean up a rather large mess of filter URLs and duplicate content.
That dropped the page count to about 14k. Then I submitted the sitemap last month, and now the crawl only found 4k pages.
Further investigation is needed on my part, but I wanted to double-check that this sudden drop was not because of the sitemap. Thanks for clarifying that!
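For anyone wanting to test the "does the crawl only cover sitemap pages" question directly, here is a minimal Python sketch that compares the URLs listed in a sitemap with the URLs a crawl reported. The sample sitemap, the example.com URLs, and the function names `sitemap_urls` and `compare` are all hypothetical; in practice the crawled set would come from whatever export your crawler provides.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def compare(sitemap, crawled):
    """Group URLs by where they appear: both sets, sitemap only, crawl only."""
    return {
        "in_both": sitemap & crawled,
        "sitemap_only": sitemap - crawled,
        "crawl_only": crawled - sitemap,
    }


# Hypothetical sample data; the URLs are placeholders, not the real site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
  <url><loc>https://example.com/contact</loc></url>
</urlset>"""

SAMPLE_CRAWLED = {
    "https://example.com/",
    "https://example.com/products",
    "https://example.com/blog",
}

report = compare(sitemap_urls(SAMPLE_SITEMAP), SAMPLE_CRAWLED)
for group, urls in report.items():
    print(group, sorted(urls))
```

If the crawl-only group is non-empty, the crawler is clearly reaching pages outside the sitemap, which would rule out the sitemap as the thing limiting the crawl.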
-
Hi David,
Messing up, changing, updating, or even deleting a sitemap will not necessarily decrease the number of ranked or crawled pages. A sitemap is usually used as a signal to find new pages and to figure out whether old ones have been deleted. I would find it unlikely that your sitemap had a significant impact on which pages went down. You could actually see the opposite: an increase in pages indexed, submitted, or crawled after you submit a sitemap.
Martijn.
-
Hey David!
Thanks for reaching out to us!
Unfortunately I am not an SEO consultant or web developer, so I cannot offer specific advice, but I'm sure there are loads of members here who would love to help and have a lot more knowledge than I do! A few things I have picked up that may help are the following:
Try to determine when the drop started. Did it begin when you submitted the XML sitemap, or when the developer changed certain things? This could help point to the reason for the drop in indexing. There are a variety of reasons why Google may choose not to index pages, but some common things to check are:
- Check your robots.txt to ensure those pages are still crawlable.
- Check for duplicate content. Were there any canonical changes?
- One of the tools you could use to help keep track of ranking fluctuations is MozCast (http://mozcast.com/). Was there turbulence in the Google algorithm when the indexed pages dropped significantly?
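As a rough illustration of the robots.txt check, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt rules and the example.com URLs are made up for the example, and `blocked_urls` is a hypothetical helper name. The default user agent `rogerbot` is assumed here to be the name Moz's crawler identifies itself with; swap in whichever crawler you care about.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="rogerbot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules: a developer has blocked the filter pages for all bots.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /filter/
"""

SAMPLE_URLS = [
    "https://example.com/products",
    "https://example.com/filter/red",
]

for url in blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS):
    print("blocked:", url)
```

Running the same list of URLs against the robots.txt from before and after the developer's changes would show whether a new Disallow rule lines up with the drop.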
If you want us to have a look at your specific campaign and investigate further, please pop an email over to help@moz.com.
Thanks!
Eli