Duplicate XML sitemaps - 404 or leave alone?
-
We switched over from our standard XML sitemap to a sitemap index. Our old sitemap was called sitemap.xml and the new one is sitemapindex.xml.
In Webmaster Tools it still shows the old sitemap.xml as valid. Also when you land on our sitemap.xml it will display the sitemap index, when really the index lives on sitemapindex.xml.
The reason you can see the sitemap on both URLs is because this is set from the sitemap plugin. So the question is, should we change the plugin setting to let the old sitemap.xml 404, or should we allow the new sitemap index to be accessed on both URLs?
-
If webmaster tools likes the old one then I wouldn't rock the boat. I don't think you are going to have any problems with having 2 site maps. But I've never toyed with this one.
-
It makes no difference.
The only ones who access your XML sitemap are web crawlers. Web crawlers become aware of your sitemap location by three methods:
-
you notify the crawler such as in Google WMT
-
you notify the crawler with a path provided in your robots.txt file
-
you notify the crawler by pinging them with your sitemap information
-
if I was to add a 4th method, crawlers can guess /sitemap.xml as a default path
As long as you have the a valid location set up in WMT (both Google and Bing), and you do not offer the alternate file name in your robots.txt or elsewhere, no one else will even know the sitemapindex.xml file exists.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
?_escaped_fragment_= Duplicate error in Webmaster
Hi I am not sure where this came from ... ?escaped_fragment= But in webmaster we are seeing hundreds of pages with this and thus webmaster is saying that we have Pages with duplicate title tags How do I fix this, or remove it. Regards T
Technical SEO | | Taiger0 -
Duplicate Content of Reseller Product?
There is a particular product/service that I resell through an API. There are quite a few of them and each one requires a lot of content. The company provides web content for each product but I'm wondering about the SEO implications of using it? Obviously using the content, it will not be unique so I won't be able to rank (easily at least) for these products. Are there any _negative_results that I can get from using this content though? If I simply won't rank for those products it's not an issue since I get traffic elsewhere. Thanks!
Technical SEO | | reliabox0 -
Remove 404 errors
I've got a site (www.dikelli.com.au) that has some 404 errors. I'm using Dreamweaver to manage the site which was built for me by I can't seem to figure out how to remove the 404 pages as it's not showing up in the directory? How would I fix this up?
Technical SEO | | sterls0 -
Duplicate Content
Hi, we need some help on resolving this duplicate content issue,. We have redirected both domains to this magento website. I guess now Google considered this as duplicate content. Our client wants both domain name to go to the same magento store. What is the safe way of letting Google know these are same company? Or this is not ideal to do this? thanks
Technical SEO | | solution.advisor0 -
Mass 404 Checker?
Hi all, I'm currently looking after a collection of old newspaper sites that have had various developments during their time. The problem is there are so many 404 pages all over the place and the sites are bleeding link juice everywhere so I'm looking for a tool where I can check a lot of URLs at once. For example from an OSE report I have done a random sampling of the target URLs and some of them 404 (eek!) but there are too many to check manually to know which ones are still live and which ones have 404'd or are redirecting. Is there a tool anyone uses for this or a way one of the SEOMoz tools can do this? Also I've asked a few people personally how to check this and they've suggested Xenu, Xenu won't work as it only checks current site navigation. Thanks in advance!
Technical SEO | | thisisOllie0 -
Duplicate Content - Mobile Site
We think that a mobile version of our site is causing a duplicate content issue; what's the best way to stop the mobile version being indexed. Basically the site forwards mobile users to "/mobile" which is just a mobile optimised version of the original site. Is it best to block the /mobile folder from being crawled?
Technical SEO | | nsmith7870 -
Google WMT shows sitemap.xml highest ranked for one main keyword
Hello, I am seeing my sitemap.xml show up in Google webmaster tools at the top for one of the main keywords for my site. This is in the Your Site on the Web - Keywords section. The URLs of my site contain this keyword, which is why I figure it showed up. I'm curious if this should be a concern to me? I find it odd that the sitemap would show up in this way. Thanks
Technical SEO | | nux0 -
Dealing with 404 pages
I built a blog on my root domain while I worked on another part of the site at .....co.uk/alpha I was really careful not to have any links go to alpha - but it seems google found and indexed it. The problem is that part of alpha was a copy of the blog - so now soon we have a lot of duplicate content. The /alpha part is now ready to be taken over to the root domain, the initial plan was to then delete /alpha. But now that its indexed I'm worried that Ill have all these 404 pages. I'm not sure what to do.. I know I can just do a 301 redirect for all those pages to go to the other ones in case a link comes on but I need to delete those pages as the server is already very slow. Or does a 301 redirect mean that I don't need those pages anymore? Will those pages still get indexed by google as separate pages? Please assist.
Technical SEO | | borderbound0