Sitemap Indexed vs. Submitted
-
My sitemap has been submitted to Google for well over 6 months and is updated frequently, a total of 979 URLs have been submitted by only 145 indexed. What can I do to get Google to index them all?
-
SF finding 'useless' links is actually part of its purpose, if you believe they're useless you should be asking why they're there. Your XML sitemap should have nothing but clean URLs; 200 response codes and not canonicalized to another URL. The problem isn't that you have category URLs, it's that those (like the one in my previous example) have a canonical tag that points elsewhere. Anytime this is the case, the URL is considered un-indexable. You can see the proof of this by doing a Google search for "https://www.interstellarstore.com/meteorite-jewelry/meteorite-necklaces", I just checked and this URL isn't in the index.
You mentioned the age in your original comment that your XML sitemap had been submitted for well over 6 months, that's where I got the age from, maybe I misunderstood?
You have no reason to not trust SF, it's one of the most valuable tools in an SEO's toolbox. I've used it for 5+ years to create hundreds of sitemaps and countless other SEO tasks with no problem in providing reliable, accurate data points.
-
Hi Logan,
I tried using Screaming Frog but it kept finding useless links, so I wrote the sitemap myself and I update it manually, I updated it only this morning. What makes you think it is over 6 months since an update?
I was told on Moz in an earlier post that having all of the category links, not just the canonical ones, wasn't a problem, is this not the case?
Every link in the sitemap should work fine, I wrote it by copy and pasting the links directly from my site. I have no trust in Screaming Frog.
-
Hi,
I poked around a bit on your sitemap and noticed a couple things:
- You've got URLs on there that have canonicals to another page. For example:This page https://www.interstellarstore.com/meteorite-jewelry/meteorite-necklaces has a canonical tag that points here https://www.interstellarstore.com/meteorite-necklaces.
- A bunch of the URLs in your sitemap redirect elsewhere or have no response - I got 13% through crawling your XML sitemap with Screaming Frog and there were zero 200 response code URLs, not good.
Both of these things combined are causing a discrepancy in the amount of submitted URLs vs. indexed URLs. If you use Screaming Frog to create your XML sitemap it's quite easy to have only clean URLs in there. You can easily remove all URLs that are not 200 status and by default Screaming Frog will exclude any URL that canonicalizes to another URL.
Also, as a side note, you should be updating your XML sitemap more frequently, a 6 month old sitemap for an ecommerce site is far too old with new products being added and products dropping off.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Home page suddenly dropped from index!!
A client's home page, which has always done very well, has just dropped out of Google's index overnight!
Intermediate & Advanced SEO | | Caro-O
Webmaster tools does not show any problem. The page doesn't even show up if we Google the company name. The Robot.txt contains: Default Flywheel robots file User-agent: * Disallow: /calendar/action:posterboard/
Disallow: /events/action~posterboard/ The only unusual thing I'm aware of is some A/B testing of the page done with 'Optimizely' - it redirects visitors to a test page, but it's not a 'real' redirect in that redirect checker tools still see the page as a 200. Also, other pages that are being tested this way are not having the same problem. Other recent activity over the last few weeks/months includes linking to the page from some of our blog posts using the page topic as anchor text. Any thoughts would be appreciated.
Caro0 -
Page Count in Webmaster Tools Index Status Versus Page Count in Webmaster Tools Sitemap
Greeting MOZ Community: I run www.nyc-officespace-leader.com, a real estate website in New York City. The page count in Google Webmaster Tools Index status for our site is 850. The page count in our Webmaster Tools Sitemap is 637. Why is there a discrepancy between the two? What does the Google Webmaster Tools Index represent? If we filed a removal request for pages we did not want indexed, will these pages still show in the Google Webmaster Tools page count despite the fact that they no longer display in search results? The number of pages displayed in our Google Webmaster Tools Index remains at about 850 despite the removal request. Before a site upgrade in June the number of URLs in the Google Webmaster Tools Index and Google Webmaster Site Map were almost the same. I am concerned that page bloat has something to do with a recent drop in ranking. Thanks everyone!! Alan
Intermediate & Advanced SEO | | Kingalan10 -
Why is this site not indexed by Google?
Hi all and thanks for your help in advance. I've been asked to take a look at a site, http://www.yourdairygold.ie as it currently does not appear for its brand name, Your Dairygold on Google Ireland even though it's been live for a few months now. I've checked all the usual issues such as robots.txt (doesn't have one) and the robots meta tag (doesn't have them). The even stranger thing is that the site does rank on Yahoo! and Bing. Google Webmaster Tools shows that Googlebot is crawling around 150 pages a day but the total number of pages indexed is zero. It does appear if you carry out a site: search on Google however. The site is very poorly optimised in terms of title tags, unnecessary redirects etc which I'm working on now but I wondered if you guys had any further insights. Thanks again for your help.
Intermediate & Advanced SEO | | iProspect-Ireland0 -
Correct Syntax for Meta No Index
Hi, Is this syntax ok for the bots, especially Google to pick up? Still waiting for Google to drop lots of duplicate pages caused by parameters out of the index - if there are parameters in the querystring it inserts the code above into the head. Thanks!
Intermediate & Advanced SEO | | bjs20101 -
Sitemaps recommend by google
Google in it guideline recommends to create a sitemap. Do they means a /sitemap.xml or does it need to be sitemap directly on the website ? Does it make any difference ? Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Canonical URLs and Sitemaps
We are using canonical link tags for product pages in a scenario where the URLs on the site contain category names, and the canonical URL points to a URL which does not contain the category names. So, the product page on the site is like www.example.com/clothes/skirts/skater-skirt-12345, and also like www.example.com/sale/clearance/skater-skirt-12345 in another category. And on both of these pages, the canonical link tag references a 3rd URL like www.example.com/skater-skirt-12345. This 3rd URL, used in the canonical link tag is a valid page, and displays the same content as the other two versions, but there are no actual links to this generic version anywhere on the site (nor external). Questions: 1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags? 2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
Intermediate & Advanced SEO | | 379seo0 -
Google indexing flash content
Hi Would googles indexing of flash content count towards page content? for example I have over 7000 flash files, with 1 unique flash file per page followed by a short 2 paragraph snippet, would google count the flash as content towards the overall page? Because at the moment I've x-tagged the roberts with noindex, nofollow and no archive to prevent them from appearing in the search engines. I'm just wondering if the google bot visits and accesses the flash file it'll get the x-tag noindex, nofollow and then stop processing. I think this may be why the panda update also had an effect. thanks
Intermediate & Advanced SEO | | Flapjack0