Duplicate XML sitemaps - 404 or leave alone?
-
We switched over from our standard XML sitemap to a sitemap index. Our old sitemap was called sitemap.xml and the new one is sitemapindex.xml.
In Webmaster Tools it still shows the old sitemap.xml as valid. Also when you land on our sitemap.xml it will display the sitemap index, when really the index lives on sitemapindex.xml.
The reason you can see the sitemap on both URLs is because this is set from the sitemap plugin. So the question is, should we change the plugin setting to let the old sitemap.xml 404, or should we allow the new sitemap index to be accessed on both URLs?
-
If webmaster tools likes the old one then I wouldn't rock the boat. I don't think you are going to have any problems with having 2 site maps. But I've never toyed with this one.
-
It makes no difference.
The only ones who access your XML sitemap are web crawlers. Web crawlers become aware of your sitemap location by three methods:
-
you notify the crawler such as in Google WMT
-
you notify the crawler with a path provided in your robots.txt file
-
you notify the crawler by pinging them with your sitemap information
-
if I was to add a 4th method, crawlers can guess /sitemap.xml as a default path
As long as you have the a valid location set up in WMT (both Google and Bing), and you do not offer the alternate file name in your robots.txt or elsewhere, no one else will even know the sitemapindex.xml file exists.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How does changing sitemaps affect SEO
Hi all, I have a question regarding changing the size of my sitemaps. Currently I generate sitemaps in batches of 50k. A situation has come up where I need to change that size to 15k in order to be crawled by one of our licensed services. I haven't been able to find any documentation on whether or not changing the size of my sitemaps(but not the pages included in them) will affect my rankings negatively or my SEO efforts in general. If anyone has any insights or has experienced this with their site please let me know!
Technical SEO | | Jason-Reid0 -
If I'm using a compressed sitemap (sitemap.xml.gz) that's the URL that gets submitted to webmaster tools, correct?
I just want to verify that if a compressed sitemap file is being used, then the URL that gets submitted to Google, Bing, etc and the URL that's used in the robots.txt indicates that it's a compressed file. For example, "sitemap.xml.gz" -- thanks!
Technical SEO | | jgresalfi0 -
Would a Search Engine treat a sitemap hosted in the cloud in the same way as if it was simply on /sitemap.htm?
Mainly to allow updates without the need for publishing - would Google interpret any differently? Thanks
Technical SEO | | RichCMF0 -
Is this duplicate content that I should be worried about?
Our product descriptions appear in two places and on one page they appear twice. The best way to illustrate that would be to link you to a search results page that features one product. My duplicate content concern refers to the following, When the customer clicks the product a pop-up is displayed that features the product description (first showing of content) When the customer clicks the 'VIEW PRODUCT' button the product description is shown below the buy buytton (second showing of content), this is to do with the template of the page and is why it is also shown in the pop-up. This product description is then also repeated further down in the tabs (third showing of content). My thoughts are that point 1 doesn't matter as the content isn't being shown from a dedicated URL and it relies on javascript. With regards to point 2, is the fact the same paragraph appears on the page twice a massive issue and a duplicate content problem? Thanks
Technical SEO | | joe-ainswoth0 -
404 from a 404 that 301s
I must be missing something or skipping a step or lacking proper levels of caffeine. Under my High Priority warnings I have a handful of 404s which are like that on purpose but I'm not sure how Moz is finding them. When I check the referrer info, the 404 is being linked to from a different 404 which is now a 301 (due to craziness of our system and what was easiest for the coders to fix a different problem ages ago). Basically, if a user decides to type in a non-existent model number into the URL there is a specific 404 that comes up. While the 404 error is "site.com/product/?model=abc123" the referrer is "site.com/product?model=abc123" (or more simply, one slash is missing). I can't see how Moz is finding the referrer so I can't figure out how to make Moz stop crawling it. I actually have the same problem in Google WMT for the same group of 404s. What am I just not seeing that will fix this?
Technical SEO | | MikeRoberts0 -
Duplicate Content Issue
SEOMOZ is giving me a number of duplicate content warnings related to pages that have an email a friend and/or email when back in stock versions of a page. I thought I had those blocked via my robots.txt file which contains the following... Disallow: /EmailaFriend.asp Disallow: /Email_Me_When_Back_In_Stock.asp I had thought that the robot.txt file would solve this issue. Anyone have any ideas?
Technical SEO | | WaterSkis.com0 -
Google indexing less url's then containded in my sitemap.xml
My sitemap.xml contains 3821 urls but Google (webmaster tools) indexes only 1544 urls. What may be the cause? There is no technical problem. Why does Google index less URLs then contained in my sitemap.xml?
Technical SEO | | Juist0 -
Crawl Errors and Duplicate Content
SEOmoz's crawl tool is telling me that I have duplicate content at "www.mydomain.com/pricing" and at "www.mydomain.com/pricing.aspx". Do you think this is just a glitch in the crawl tool (because obviously these two URL's are the same page rather than two separate ones) or do you think this is actually an error I need to worry about? Is so, how do I fix it?
Technical SEO | | MyNet0