Duplicate XML sitemaps - 404 or leave alone?
-
We switched over from our standard XML sitemap to a sitemap index. Our old sitemap was called sitemap.xml and the new one is sitemapindex.xml.
In Webmaster Tools it still shows the old sitemap.xml as valid. Also when you land on our sitemap.xml it will display the sitemap index, when really the index lives on sitemapindex.xml.
The reason you can see the sitemap on both URLs is because this is set from the sitemap plugin. So the question is, should we change the plugin setting to let the old sitemap.xml 404, or should we allow the new sitemap index to be accessed on both URLs?
-
If webmaster tools likes the old one then I wouldn't rock the boat. I don't think you are going to have any problems with having 2 site maps. But I've never toyed with this one.
-
It makes no difference.
The only ones who access your XML sitemap are web crawlers. Web crawlers become aware of your sitemap location by three methods:
-
you notify the crawler such as in Google WMT
-
you notify the crawler with a path provided in your robots.txt file
-
you notify the crawler by pinging them with your sitemap information
-
if I was to add a 4th method, crawlers can guess /sitemap.xml as a default path
As long as you have the a valid location set up in WMT (both Google and Bing), and you do not offer the alternate file name in your robots.txt or elsewhere, no one else will even know the sitemapindex.xml file exists.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap Rules
Hello there, I have some questions pertaining to sitemaps that I would appreciate some guidance on. 1. Can an XML sitemap contain URLs that are blocked by robots.txt? Logically, it makes sense to me to not include pages blocked by robots.txt but would like some clarity on the matter i.e. will having pages blocked by robots.txt in a sitemap, negatively impact the benefit of a sitemap? 2. Can a XML sitemap include URLs from multiple subdomains? For example: http://www.example.com/www-sitemap.xml would include the home page URL of two other subdomains i.e. http://blog.example.com/ & http://blog2.example.com/ Thanks
Technical SEO | | SEONOW1230 -
Duplicate Title Tags How harmful is it?
I recently updated a site to use a new WordPress theme. This theme in conjunction with Yoast SEO plugin is causing duplicate Title tags. The tags have the same title in them. I discovered this when adding new keywords and pages to the page optimizer in moz. I have since turned off moz to stop the duplicate title tag issue however I am wondering if tuning off yoast is worse then the duplicate title tags. Any clarification on the duplicate page title issue and its consequences would be greatly appreciated.
Technical SEO | | donsilvernail0 -
XML Sitemap Generators
I am looking to use a different sitemap generator that can do 5 thousand or more pages at once. Any recommendations? Thanks guys.
Technical SEO | | Chenzo0 -
Duplicate content or titles
Hello , I am working on a site, I am facing the duplicate title and content errors,
Technical SEO | | KLLC
there are following kind of errors : 1- A link with www and without www having same content. actually its a apartment management site, so it has different bedrooms apartments and booking pages , 2- my second issue is related to booking and details pages of bedrooms, because I am using 1 file for all booking and 1 file for all details page. these are the main errors which i am facing ,
can anyone give me suggestions regarding these issues ? Thnaks,0 -
How to protect against duplicate content?
I just discovered that my company's 'dev website' (which mirrors our actual website, but which is where we add content before we put new content to our actual website) is being indexed by Google. My first thought is that I should add a rel=canonical tag to the actual website, so that Google knows that this duplicate content from the dev site is to be ignored. Is that the right move? Are there other things I should do? Thanks!
Technical SEO | | williammarlow0 -
Duplicate Content
The crawl shows a lot of duplicate content on my site. Most of the urls its showing are categories and tags (wordpress). so what does this mean exactly? categories is too much like other categories? And how do i go about fixing this the best way. thanks
Technical SEO | | vansy0 -
Duplicate content
I'm getting an error showing that two separate pages have duplicate content. The pages are: | Help System: Domain Registration Agreement - Registrar Register4Less, Inc. http://register4less.com/faq/cache/11.html 1 27 1 Help System: Domain Registration Agreement - Register4Less Reseller (Tucows) http://register4less.com/faq/cache/7.html | These are both registration agreements, one for us (Register4Less, Inc.) as the registrar, and one for Tucows as the registrar. The pages are largely the same, but are in fact different. Is there a way to flag these pages as not being duplicate content? Thanks, Doug.
Technical SEO | | R4L0 -
Duplicate Page Titles
I had an issue where I was getting duplicate page titles for my index file. The following URLs were being viewed as duplicates: www.calusacrossinganimalhospital.com www.calusacrossinganimalhospital.com/index.html www.calusacrossinganimalhospital.com/ I tried many solutions, and came across the rel="canonical". So i placed the the following in my index.html: I did a crawl, and it seemed to correct the duplicate content. Now I have a new message, and just want to verify if this is bad for search engines, or if it is normal. Please view the attached image. i9G89.png
Technical SEO | | pixel830