Image Sitemap Indexing Issue
-
Hello Folks,
I've been running into some strange issues with our XML Sitemaps.
1) The XML Sitemaps won't open in a browser. Instead of opening the XML Sitemap, the browser throws the following error.
Sample XML Sitemap: www.veer.com/sitemap/images/Sitemap0.xml.gz
Error: "XML Parsing Error: no element found
Location: http://www.veer.com/sitemap/images/Sitemap0.xml
Line Number 1, Column 1:"
2) Image files are not getting indexed. For instance, the sitemap www.veer.com/sitemap/images/Sitemap0.xml.gz has 6,000 URLs and 6,000 images. However, only 3,481 URLs and 25 images are getting indexed. The sitemap formatting seems good, but I can't figure out why Google is de-indexing the images and why only 50-60% of the URLs are getting indexed. Thank you for your help!
-
Hi Cyrus,
Thank you for your note, and my apologies for the delay in responding.
The indexation number is from Google Webmaster Tools.
The two files are identical. I've tested other XML sitemap files in GZ format, and they opened fine in the browser without unzipping them or prompting a download. The sitemaps were uploaded to GWT as .gz files only, since we have many pages to upload.
I'll check with our Dev Team regarding the XML parsing error.
Please let me know what other areas we need to look into based on my answers to your questions. Thank you for your help, I greatly appreciate it!
-
Some possible suggestions:
- Make sure every image has a width and height attribute defined in the HTML. Images are much more likely to be indexed this way.
- Same with the "alt" attribute
- Make sure your image subdirectory isn't blocked (robots.txt for example)
- Same with the pages
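The width, height, and alt suggestions in the list above can be spot-checked on a page's HTML. This is only a minimal sketch using Python's standard library; the function name `audit_images` and the sample markup are made up for illustration, and it looks only at `<img>` tags in the HTML you hand it.

```python
from html.parser import HTMLParser


class ImgAudit(HTMLParser):
    """Collect <img> tags that lack a width, height, or alt attribute."""

    REQUIRED = ("width", "height", "alt")

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, [missing attribute names])

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        present = {name for name, _ in attrs}
        missing = [a for a in self.REQUIRED if a not in present]
        if missing:
            self.problems.append((dict(attrs).get("src"), missing))


def audit_images(html):
    """Return the images in `html` that are missing required attributes."""
    parser = ImgAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


if __name__ == "__main__":
    # Hypothetical markup, not taken from the site in question.
    sample = '<img src="a.jpg" width="10" height="20" alt="x"><img src="b.jpg">'
    for src, missing in audit_images(sample):
        print(src, "is missing:", ", ".join(missing))
```

An empty `alt=""` counts as present here; whether that is acceptable depends on the image.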
It may be Google actually is indexing those images, but not reporting them in GWT. Do an image search and narrow results to your site, to see if your images actually appear.
Aside from accessibility issues, make sure the images are on well-linked pages. An image is much more likely to be indexed when it sits on a page with good link metrics and no crawl problems.
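To rule out the robots.txt blocking mentioned above, one option is to test a handful of image and page URLs against the robots.txt rules. The sketch below uses Python's standard `urllib.robotparser`; the helper name `blocked_urls`, the rules, and the example.com URLs are all hypothetical. Note that this parser does simple prefix matching, so it may not mirror Googlebot's own handling of wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


if __name__ == "__main__":
    # Hypothetical rules and URLs, for illustration only.
    rules = "User-agent: *\nDisallow: /images/private/\n"
    candidates = [
        "http://www.example.com/images/private/a.jpg",
        "http://www.example.com/images/public/b.jpg",
    ]
    for url in blocked_urls(rules, candidates):
        print("Blocked by robots.txt:", url)
```

Pasting in the site's real robots.txt and a sample of URLs from the sitemap would show quickly whether the image directory or the pages are disallowed.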
-
@Cyrus
You have given a very good explanation, but I have a similar issue with my image sitemap. The crawling-to-indexing ratio is quite good; you can see more in the attachment.
You can check the syntax of the image sitemap in the following XML.
http://www.vistastores.com/patio_umbrellas_sitemap.xml
Can you give me some input: how can I improve crawling and indexing for images?
-
Hi Corbis,
Man, you've got some tough questions! I may have to call in some outside support on this one if we can't figure it out.
First of all, are you getting the indexation numbers from Google Webmaster Tools? In other words, is Google saying there are 6,000 URLs in your sitemap, but that it is only indexing 3,481?
When I unzipped the compressed sitemap file, it opened fine in my browser, while the 2nd uncompressed file did not. Are they identical? And have you submitted both to Google?
There could be many reasons why you're getting the XML parsing error. One issue might be in the second line, referencing http://www.google.com/schemas/sitemap-image/1.1/ as a Schema location, because this is an html webpage and not an XML or DTD file. You might try removing the reference to this URL, and see if that helps.
Otherwise, if Google is reporting the correct number of URLs and images, then you know they are aware of those URLs, and the problem may not be with the sitemap. Google doesn't necessarily index all URLs in a sitemap, but instead bases its indexing on factors like your domain authority, link structure and crawl allowance. Addressing these issues will usually help get more pages indexed than a sitemap alone.
So if you can fix internal crawl errors and duplicate content issues, and make sure there is a good navigational architecture to your site, you should see a good rise in indexation.
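To compare the compressed and uncompressed files and confirm how many URLs and images the sitemap really contains, a small script can decompress and parse it outside the browser. This is only a sketch: the function name `inspect_sitemap` is made up, and it assumes the sitemap uses the standard sitemaps.org namespace together with Google's image extension namespace. An empty response body is one common cause of a "no element found" error at line 1, so the sketch checks for that explicitly.

```python
import gzip
import xml.etree.ElementTree as ET

# Assumed namespaces: the sitemaps.org schema and Google's image extension.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def inspect_sitemap(raw):
    """Return (url_count, image_count) for sitemap bytes, gzipped or plain."""
    # gzip streams start with the magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    if not raw.strip():
        raise ValueError("sitemap body is empty")
    root = ET.fromstring(raw)  # raises ET.ParseError on malformed XML
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    images = root.findall(f".//{{{IMAGE_NS}}}image")
    return len(urls), len(images)


if __name__ == "__main__":
    # Hypothetical local copy of the sitemap, downloaded beforehand.
    with open("Sitemap0.xml.gz", "rb") as fh:
        print(inspect_sitemap(fh.read()))
```

If the counts match what Google Webmaster Tools reports as submitted, the sitemap itself is probably being read correctly and the gap is on the indexing side.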
-
Hi Folks,
Just following up on this query. Any insights? Thank you for your help!
-Corbis