Google Sitemap only indexing 50% Is that a problem?
-
We have about 18,000 pages submitted on our Google Sitemap and only about 9000 of them are indexed. Is this a problem?
We have a script that creates a sitemap on a daily basis and it is submitted on a daily basis. Am I better off only doing it once a week? Is this why I never get to the full 18,000 indexed?
-
My robots, tags and redirects are all good now. Any other things to look at?
-
Have you done some troubleshooting? If there's that much of a % change, did you check your robots, tags, redirects, etc. to see if any of the technical side may be hindering indexing?
-
It is a large e-commerce site with pretty much the exact situation described. We re did the site about 6 weeks ago and the site before was always close to 100% indexed. It was about 17900 out of 18000.
-
Great answer Donford. We have a large site, with many items that are basically the same but usually have one different attribute value. So Google will typical index a parent page and list the rest as:
Results 1 - 15 of 15 – Medium Duty - Swivel Top Plate - Capacity to 400 lbs ...
So even though the page may not be in the primary index, it will still help the visitor get to what they are looking for. So I would advise grabbing a snippet of text on a page not indexed and using it as a query to see if this is the case.
-
Google will index more as they find value in more links. The last ecommerce site I worked on had 12,000 pages as of the end of the year they were 85% indexed.
It is quite common from my experience for larger sites to take awhile to be fully indexed if ever at all. Here is what Goolge says about ensuring proper setup, but other then what they say, its all about content and uniqueness. A particular challenge for some e-commerce sites whom sell items that are similar in nature. Like 1/2"x1" screw vs 5/8" x 1" screw. Its very hard to develop unique content for items that similar.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Only Indexing Canonical Root URL Instead of Specified URL Parameters
We just launched a website about 1 month ago and noticed that Google was indexing, but not displaying, URLs with "?location=" parameters such as: http://www.castlemap.com/local-house-values/?location=great-falls-virginia and http://www.castlemap.com/local-house-values/?location=mclean-virginia. Instead, Google has only been displaying our root URL http://www.castlemap.com/local-house-values/ in its search results -- which we don't want as the URLs with specific locations are more important and each has its own unique list of houses for sale. We have Yoast setup with all of these ?location values added in our sitemap that has successfully been submitted to Google's Sitemaps: http://www.castlemap.com/buy-location-sitemap.xml I also tried going into the old Google Search Console and setting the "location" URL Parameter to Crawl Every URL with the Specifies Effect enabled... and I even see the two URLs I mentioned above in Google's list of Parameter Samples... but the pages are still not being added to Google. Even after Requesting Indexing again after making all of these changes a few days ago, these URLs are still displaying as Allowing Indexing, but Not On Google in the Search Console and not showing up on Google when I manually search for the entire URL. Why are these pages not showing up on Google and how can we get them to display? Only solution I can think of would be to set our main /local-house-values/ page to noindex in order to have Google favor all of our other URL parameter versions... but I'm guessing that's probably not a good solution for multiple reasons.
Intermediate & Advanced SEO | | Nitruc0 -
This url is not allowed for a Sitemap at this location error using pro-sitemaps.com
Hey, guys, We are using the pro-sitemaps.com tool to automate our sitemaps on our properties, but some of them give this error "This url is not allowed for a Sitemap at this location" for all the urls. Strange thing is that not all of them are with the error and most have all the urls indexed already. Do you have any experience with the tool and what is your opinion? Thanks
Intermediate & Advanced SEO | | lgrozeva0 -
Website penalised can't find where the problem is. Google went INSANE
Hello, I desperately need a hand here! Firstly I just want to say that I we never infracted google guidelines as far as we know. I have been around in this field for about 6 years and have had success with many websites on the way relying only in natural SEO and was never penalised until now. The problem is that our website www.turbosconto.it is and we have no idea why. (not manual) The web has been online for more than 6 months and it NEVER started to rank. it has about 2 organic visits a day at max. In this time we got several links from good websites which are related to our topic which actually keep sending us about 50 visits a day. Nevertheless our organic visita are still 1 or 2 a day. All the pages seem to be heavily penalised ... when you perform a search for any of our "shops"even including our Url, no entries for the domain appear. A search example: http://www.turbosconto.it zalando What I will expect to find as a result: http://www.turbosconto.it/buono-sconto-zalando The same case repeats for all of the pages for the "shops" we promote. Searching any of the brads + our domain shows no result except from "nike" and "euroclinix" (I see no relationship between these 2) Some days before for these same type of searches it was showing pages from the domain which we blocked by robots months ago, and which go to 404 error instead of our optimised landing pages which cannot be found in the first 50 results. These pages are generated by our rating system... We already send requests to de index all theses pages but they keep appearing for every new page that we create. And the real pages nowhere to be found... Here isan example: http://www.turbosconto.it/shops/codice-promozionale-pimkie/rat
Intermediate & Advanced SEO | | sebastiankoch
You can see how google indexes that for as in this search: site:www.turbosconto.it rate Why on earth will google show a page which is blocked by the robots.txt displaying that the content cannot retrieved because it is blocked by the robots instead of showing pages which are totally SEO Friendly and content rich... All the script from TurboSconto is the same one that we use in our spanish version www.turbocupones.com. With this last one we have awesome results, so it makes things even more weird... Ok apart from those weird issues with the indexation and the robots, why did a research on out backlinks and we where surprised to fin a few bad links that we never asked for. Never the less there are just a few and we have many HIGH QUALITY LINKS, which makes it hard to believe that this could be the reason. Just to be sure we, we used the disavow tool for these links, here are the bad links we submitted 2 days ago: domain: www.drilldown.it #we did not ask for this domain: www.indicizza.net #we did not ask for this domain: urlbook.in #we did not ask for this, moreover is a spammy one http://inpe.br.way2seo.org/domain-list-878 #we did not ask for this, moreover is a spammy one http://shady.nu.gomarathi.com/domain-list-789 #we did not ask for this, moreover is a spammy one http://www.clicdopoclic.it/2013/12/i-migliori-siti-italiani-di-coupon-e.html #we did not ask for this, moreover and is a copy of a post of an other blog http://typo.domain.bi/turbosconto.it I have no clue what can it be, we have no warning messages in the webmaster tools or anything.
For me it looks as if google has a BUG and went crazy on judging our italian website. Or perhaps we are just missing something ??? If anyone could throw some light on this I will be really glad and willing to pay some compensation for the help provided. THANKS A LOT!0 -
Hreflang Sitemap
Hi all, Have you ever created an hreflang sitemap using in-house resources or a third-party company for a group of over 70 sites, each with hundreds of pages and all with localised URLs? If so, would you mind sharing how you did it, or the contact details of the company you used? In this specific case there is nothing in the URL or code that I can use to group the alternatives automatically. Thanks, Carlos
Intermediate & Advanced SEO | | Carlos-R0 -
Best way to permanently remove URLs from the Google index?
We have several subdomains we use for testing applications. Even if we block with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt. I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (6 months)? Google will re-index (and mark them as blocked by robots.txt). What is the best way to permanently remove these from the index? We can't use login to block because our clients want to be able to view these applications without needing to login. What is the next best solution?
Intermediate & Advanced SEO | | nicole.healthline0 -
Does Google Index an Alert Div w/Delayed Hide
We have a div at the top of a client's the page that displays an alert to the user. After 30 seconds it is rendered hidden. Does Google index this? Does Google take this into account when it ranks the page?
Intermediate & Advanced SEO | | WEOMedia0 -
Indexed Pages in Google, How do I find Out?
Is there a way to get a list of pages that google has indexed? Is there some software that can do this? I do not have access to webmaster tools, so hoping there is another way to do this. Would be great if I could also see if the indexed page is a 404 or other Thanks for your help, sorry if its basic question 😞
Intermediate & Advanced SEO | | JohnPeters0 -
Google Re-Index or multiple 301 Redirects on the server?
Over a year ago we moved a site from Blogspot that was adding dates in the URL's (i.e.. blog/2012/08/10/) Additionally we've removed category folders (/category, /tag, etc). Overall if I add all these redirects (from the multiple date options, etc) I'm concerned it might be an overload on the server? After talking with the server team they had suggested using something like 'BWP Google Sitemaps' on our Wordpress site, which would allow Google some time to re-index our site. What do you suggest we do?
Intermediate & Advanced SEO | | seointern0