Google Indexing of Site Map
-
We recently launched a new site. On June 4th we submitted our sitemap to Google and almost instantly had all 25,000 URLs crawled (yay!).
On June 18th, we updated the title and description tags for the majority of pages on our site and added new content to our home page, so we submitted a new sitemap.
So far the results have been underwhelming: Google has indexed a very low number of the updated pages. As a result, only a handful of the new titles and descriptions are showing up in the SERPs.
Any ideas as to why this might be? What are the tricks to having Google re-index all of the URLs in a sitemap?
-
No problem, it's actually really easy:
https://www.google.com/webmasters/tools/googlebot-fetch
Once you have selected your account, add the URL and then submit it to the index. I would do the homepage first, and for that page use the "Crawl this URL and its direct links" option. Then for the subpages, use the "Crawl only this URL" option. It can also help to use "Crawl this URL and its direct links" for any of your top-level menu items to speed things up.
"For example, I just checked a page and saw that some images weren't being indexed." Does your robots.txt file explicitly allow access to those pages? If not, here is how you can set it to do so. This will also allow Google's partners to access your images. Add this to the bottom of your robots.txt file:
User-agent: Googlebot-Image
Allow: /images/
User-agent: Adsbot-Google
Allow: /
User-agent: Googlebot-Mobile
Allow: /
User-agent: Mediapartners-Google*
Allow: /
Sitemap: http://www.YOURSITEHERE.com/sitemap.xml
-
Thank you!! I'll take a look through the Google resource. Also, the site:domain search revealed 35,000 results.
The results are there, just not reindexed.
-
David,
Thanks for your response. This is exactly what we've seen, with the initial spike in ranking and now things settling down. I'll make sure the team has the crawl request set to daily (which I think it is).
For Fetch as Google, what's the best way that you've used this? For example, I just checked a page and saw that some images weren't being indexed. If I correct the issue, can I just use "Submit to Index"?
Thanks!!!!
-
In the thousands of sites we have submitted, all show an initial spike in ranking and indexing before things settle down for the long haul. It seems like Google makes a "best guess" first, before taking the time to fully crawl and analyze all of the URLs and rank them accordingly. As always, resubmit the pages through all webmaster tools (Bing too!) so that they are always aware of the most recent updates. If you are planning on updating the pages frequently, I would set your crawl request (the change frequency) to daily in your sitemap. Google probably won't honor it anyway, but you can try.
Use Fetch as Google religiously when you update. It is your friend.
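For anyone unsure what a daily crawl request looks like in practice, here is a minimal sketch in Python that writes a sitemap with a change-frequency hint on every URL. It assumes the standard sitemaps.org `<changefreq>` and `<lastmod>` elements are what is meant; the function name `build_sitemap`, the example URLs, and the date are hypothetical placeholders, and as noted above the hint may simply be ignored.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="daily", lastmod=None):
    """Return sitemap XML as a string (hypothetical helper, not a Google API).

    Every URL gets the same <lastmod> and <changefreq> values. These are
    hints only; a crawler is free to ignore them.
    """
    lastmod = lastmod or date.today().isoformat()
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
        ET.SubElement(url, "changefreq").text = changefreq
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs; replace with the pages whose titles and descriptions changed.
    pages = [
        "http://www.YOURSITEHERE.com/",
        "http://www.YOURSITEHERE.com/some-page/",
    ]
    print(build_sitemap(pages))
```

Updating `lastmod` only for pages that actually changed, rather than stamping every URL with today's date, keeps the file an honest signal of what is new.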
-
Hi there
Did you read through Google's indexing resources?
I would also try a quick "site:yourdomain.com" search to see how many pages Google pulls up. That's a more accurate representation of what's indexed from your site. This is reflected in the resource above:
"Sometimes the data we show in Index Status is not fully reflected in Google Search results." I suggest reading through the resource and also performing that search. Getting Google to index your sitemap is a waiting game. Keep an eye on it, and just be patient!
Hope this helps! Good luck!