Duplicate XML sitemaps - 404 or leave alone?
-
We switched over from our standard XML sitemap to a sitemap index. Our old sitemap was called sitemap.xml and the new one is sitemapindex.xml.
Webmaster Tools still shows the old sitemap.xml as valid. Also, when you land on our sitemap.xml it displays the sitemap index, when really the index lives at sitemapindex.xml.
You can see the sitemap index on both URLs because that is how the sitemap plugin is configured. So the question is: should we change the plugin setting so the old sitemap.xml returns a 404, or should we allow the new sitemap index to be accessed on both URLs?
-
If Webmaster Tools likes the old one, then I wouldn't rock the boat. I don't think you are going to have any problems with having two sitemaps, but I've never toyed with this one.
-
It makes no difference.
The only ones who access your XML sitemap are web crawlers. Web crawlers become aware of your sitemap location by three methods:
- You notify the crawler directly, such as in Google WMT.
- You notify the crawler with a path provided in your robots.txt file.
- You notify the crawler by pinging it with your sitemap information.
- If I were to add a fourth method: crawlers can guess /sitemap.xml as a default path.
As long as you have a valid location set up in WMT (both Google and Bing), and you do not offer the alternate file name in your robots.txt or elsewhere, no one else will even know the sitemapindex.xml file exists.
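If you want to double-check which sitemap paths your robots.txt actually advertises, here is a rough sketch using Python's standard-library robots.txt parser. The robots.txt contents, the www.example.com URL, and the helper name sitemaps_in_robots are made up for illustration; paste in your own file's text instead.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared via 'Sitemap:' lines in robots.txt text.

    Hypothetical helper for illustration; it only reads the text passed in
    and makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present (Python 3.8+)
    return parser.site_maps() or []


# Assumed example file; www.example.com stands in for the real site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemapindex.xml
"""

print(sitemaps_in_robots(EXAMPLE_ROBOTS))
```

If only the file name you submitted in WMT shows up here (or nothing at all), then the other name is not being advertised through robots.txt.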