Google Indexing of Site Map
-
We recently launched a new site. On June 4th we submitted our sitemap to Google and almost instantly had all 25,000 URLs crawled (yay!).
On June 18th, we updated the title and description tags for the majority of pages on our site and added new content to our home page, so we submitted a new sitemap.
So far the results have been underwhelming: Google has indexed very few of the updated pages. As a result, only a handful of the new titles and descriptions are showing up on the SERPs.
Any ideas as to why this might be? What are the tricks to getting Google to re-index all of the URLs in a sitemap?
-
No problem, it's actually really easy:
https://www.google.com/webmasters/tools/googlebot-fetch
Once you have selected your account, add the URL and then submit it to the index. I would do the home page first and, for that page, use the "Crawl this URL and its direct links" option. Then for the subpages, use the "Crawl only this URL" option. It can also help to use "Crawl this URL and its direct links" for your top-level menu items to speed things up.
"For example, I just checked a page and saw that some images weren't being indexed." Does your robots.txt file explicitly allow access to those images? If not, here is how to set it up; this will also allow Google's partners to access your images. Add the following to the bottom of your robots.txt file:
User-agent: Googlebot-Image
Allow: /images/
User-agent: Adsbot-Google
Allow: /
User-agent: Googlebot-Mobile
Allow: /
User-agent: Mediapartners-Google
Allow: /
Sitemap: http://www.YOURSITEHERE.com/sitemap.xml
-
Thank you!! I'll take a look through the Google resource. Also, the site:domain search revealed 35,000 results.
The results are there, just not reindexed.
-
David,
Thanks for your response. This is exactly what we've seen: the initial spike in ranking, and now things settling down. I'll make sure the team has the crawl frequency set to daily (which I think it is).
For Fetch as Google - what's the best way that you've used it? For example, I just checked a page and saw that some images weren't being indexed. If I correct the issue, can I just use "Submit to Index"?
Thanks!!!!
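As a side note, once robots.txt rules like the ones suggested above are in place, they can be sanity-checked locally before resubmitting anything. Here is a minimal sketch using Python's standard-library parser; the domain, paths, and the blanket Disallow rule are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a blanket Disallow on /images/ plus a
# bot-specific Allow, mirroring the directives suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/

User-agent: Googlebot-Image
Allow: /images/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Googlebot-Image matches its specific group, so the Allow wins...
print(rp.can_fetch("Googlebot-Image", "http://www.example.com/images/logo.png"))
# ...while any other crawler falls back to the blanket Disallow.
print(rp.can_fetch("SomeOtherBot", "http://www.example.com/images/logo.png"))
```

This only checks the rules as a crawler would interpret them; it doesn't tell you whether Googlebot has actually re-fetched the images yet.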
-
Of the thousands of sites we have submitted, all show an initial spike in ranking and indexing before things settle down for the long haul. It seems like Google does a "best guess" pass before taking the time to fully crawl and analyze all of the URLs and rank them accordingly. As always, resubmit the pages through all the webmaster tools (Bing too!) so that the engines are always aware of the most recent updates. If you are planning on updating the pages frequently, I would set the crawl frequency in your sitemap to daily. They probably won't honor it anyway, but you can try.
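For reference, that crawl-frequency hint lives in the sitemap itself as a changefreq element, and lastmod tells the engines when a page last changed. A quick sketch of generating a one-entry sitemap with Python's standard library; the domain and path are placeholders:

```python
from datetime import date
from xml.etree.ElementTree import Element, SubElement, tostring

# Build a one-entry sitemap with a daily changefreq hint.
# The domain and path are placeholders.
urlset = Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
url = SubElement(urlset, "url")
SubElement(url, "loc").text = "http://www.example.com/updated-page"
SubElement(url, "lastmod").text = date.today().isoformat()  # when the page last changed
SubElement(url, "changefreq").text = "daily"                # a hint, not a command

sitemap_xml = tostring(urlset, encoding="unicode")
print(sitemap_xml)
```

As noted above, changefreq is only a suggestion; an accurate lastmod is generally the more useful signal.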
Use Fetch as Google religiously when you update. It is your friend.
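Resubmitting can also be scripted: both Google and Bing have supported a simple sitemap "ping" endpoint that takes the sitemap URL as a query parameter. A sketch of building those ping URLs (the domain is a placeholder, and the endpoint URLs are assumptions worth verifying before relying on them):

```python
from urllib.parse import urlencode

SITEMAP_URL = "http://www.example.com/sitemap.xml"  # placeholder domain

# Ping endpoints both engines have supported; verify they are still live.
ping_urls = [
    endpoint + "?" + urlencode({"sitemap": SITEMAP_URL})
    for endpoint in ("https://www.google.com/ping", "https://www.bing.com/ping")
]

for ping_url in ping_urls:
    print(ping_url)
    # urllib.request.urlopen(ping_url)  # uncomment to actually send the ping
```

A ping only tells the engine the sitemap changed; it doesn't guarantee a re-crawl, so it complements rather than replaces Fetch as Google.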
-
Hi there
Did you read through Google's indexing resources?
I would also try a quick "site:yourdomain.com" search and see how many pages Google pulls up - that's a more accurate representation of what's indexed from your site. This is reflected in the resource above:
"Sometimes the data we show in Index Status is not fully reflected in Google Search results." I suggest reading through the resource and also performing that search. Getting Google to index your sitemap is a waiting game; keep watching, and be patient!
Hope this helps! Good luck!