Sitemap.xml problem in Google webmaster
-
Hi,
My sitemap.xml is not submitting correctly in Google Webmaster Tools.
There are 697 URLs submitted, but only 56 are in Google's index.
At the top of Webmaster Tools, this is what it says:
http://www.example.com/sitemap.xml has been resubmitted.
But when I click the status button, a red X appears.
Any suggestions? Thanks...
-
Cheers for your reply and answer.
Yes, most of your assumptions were correct: I am using sitemap generation software. The issue is fixed. There was a problem with the sitemap when it was created, but it's all sorted now and submitted correctly in WMT.
Thanks...
-
For the 8 invalid pages, you need to fix the URLs. Based on your questions I assume you are using some form of sitemap generation software. Apparently it is not configured correctly. You will need to take a look at these pages to determine why the URLs are invalid and/or contact the sitemap software vendor.
With respect to indexing, submitting a sitemap is no guarantee that the pages will be indexed. You can submit a 1,000-page site and have every page indexed, or you can have only a couple hundred pages indexed. There are a variety of factors involved.
Some factors which can affect indexing:
-
Is your robots.txt file blocking any of these pages?
-
Are any of these pages duplicate content?
-
Are any of the pages invalid URLs?
-
Are any of these pages canonicalized to other pages?
-
Are any of these pages 301'd to other pages?
-
How well is your site's navigation working? Sitemaps help Google find island pages and such, but your site will be crawled much better with proper navigation along with both internal and external links.
-
How popular are your site and these pages? Pages with good Page Authority (PA) are crawled regularly, and sites with high Domain Authority (DA) are crawled more frequently and more deeply than other sites.
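The robots.txt question in the list above can be checked offline. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `blocked_urls`, the sample robots.txt rules, and the example.com URLs are all made up for illustration and are not taken from the asker's site.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent.

    Hypothetical helper: it only applies the standard library's rule matching,
    which may differ in edge cases from how Google interprets robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Illustrative robots.txt content (an assumption, not the real file).
sample_robots = """User-agent: *
Disallow: /private/
"""

# Illustrative sitemap URLs (also assumptions).
sample_urls = [
    "http://www.example.com/exhibitions/info_22.html",
    "http://www.example.com/private/page.html",
]

print(blocked_urls(sample_robots, sample_urls))
```

Any URL this reports is one that a crawler honoring these rules would not fetch, so it is unlikely to be indexed no matter what the sitemap lists.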
-
-
I'm just wondering how to go about fixing these. I see that they are not valid. Also, once they are fixed, do you think this will solve the sitemap issue? (That is, are these 8 invalid pages causing the 600+ pages not to be indexed?) Thanks.
-
The links in your reply are not valid. Try clicking on one of them. They are to your secure Google WMT page and they have an extra http:// prefix.
-
The errors look like this:

1916
Invalid URL
This is not a valid URL. Please correct it and resubmit.
URL: http://exhibitions/info_22.html
Parent tag: url
Tag: loc
Problem detected on: Aug 4, 2011

1919
Invalid URL
This is not a valid URL. Please correct it and resubmit.
URL: http://irish-myths-and-legends/info_12.html
Parent tag: url
Tag: loc
Problem detected on: Aug 4, 2011

There are about 10 errors like the above. Any suggestions?
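In both reported URLs the host part (`exhibitions`, `irish-myths-and-legends`) looks like a path segment, which suggests the site's domain was left out when the sitemap was generated. The sketch below is one possible way to list such entries before resubmitting. It assumes that a valid public host contains at least one dot, and the helper name `suspicious_locs` and the sample sitemap are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def suspicious_locs(sitemap_xml):
    """Return <loc> values whose scheme is not http(s) or whose host has no dot.

    The "host must contain a dot" rule is a heuristic assumption, not the
    validation that Google Webmaster Tools actually performs.
    """
    root = ET.fromstring(sitemap_xml)
    bad = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        url = (loc.text or "").strip()
        parts = urlparse(url)
        host = parts.hostname or ""
        if parts.scheme not in ("http", "https") or "." not in host:
            bad.append(url)
    return bad


# Illustrative sitemap: two entries copied from the error report above,
# plus one well-formed entry on an assumed example.com domain.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/index.html</loc></url>
  <url><loc>http://exhibitions/info_22.html</loc></url>
  <url><loc>http://irish-myths-and-legends/info_12.html</loc></url>
</urlset>"""

for url in suspicious_locs(sample_sitemap):
    print(url)
# prints:
# http://exhibitions/info_22.html
# http://irish-myths-and-legends/info_12.html
```

If every flagged entry is missing the same domain, the likely fix is in the generator's base URL setting. Hand-editing the file would only last until the next regeneration.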
-
You need to click on the sitemap in Google WMT and it will inform you of the issue. There are many possible causes ranging from the sitemap link not being accessible to the file not being formatted correctly.
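The "not formatted correctly" case can also be checked locally before resubmitting. This is a rough sketch, assuming the sitemap follows the sitemaps.org 0.9 schema; the function name `sitemap_format_problems` is hypothetical, and the checks cover only a small part of what Google WMT reports.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_format_problems(xml_text):
    """Return a list of basic format problems found in a sitemap document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The file is not well-formed XML, so nothing else can be checked.
        return [f"not well-formed XML: {exc}"]

    problems = []
    expected_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in expected_roots:
        problems.append(f"unexpected root element: {root.tag}")
    if not list(root.iter(f"{{{SITEMAP_NS}}}loc")):
        problems.append("no <loc> entries found")
    return problems


# Illustrative inputs (assumptions, not the asker's actual file).
good_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
</urlset>"""
broken_sitemap = "<urlset><url>"

print(sitemap_format_problems(good_sitemap))    # an empty list means no problems found
print(sitemap_format_problems(broken_sitemap))  # reports a parse error
```

An empty result only means these basic checks passed. The accessibility case, where the sitemap URL itself cannot be fetched, still has to be confirmed by loading it in a browser or from WMT.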