I have two sitemaps which partly duplicate - one is blocked by robots.txt but can't figure out why!
-
Hi, I've just found two sitemaps - one of them is .php and represents part of the site structure on the website. The second is a .txt file which lists every page on the website. The .txt file is blocked via robots exclusion protocol (which doesn't appear to be very logical as it's the only full sitemap). Any ideas why a developer might have done that?
-
There are standards for the sitemaps .txt and .xml sitemaps, where there are no standards for html varieties. Neither guarantees the listed pages will be crawled, though. HTML has some advantage of potentially passing pagerank, where .txt and .xml varieties don't.
These days, xml sitemaps may be more common than .txt sitemaps but both perform the same function.
-
yes, sitemap.txt is blocked for some strange reason. I know SEOs do this sometimes for various reasons, but in this case it just doesn't make sense - not to me, anyway.
-
Thanks for the useful feedback Chris - much appreciated - Is it good practice to use both - I guess it's a good idea if onsite version only includes top-level pages? PS. Just checking nature of block!
-
Luke,
The .php one would have been created as a navigation tool to help users find what they're looking for faster, as well as to provide html links to search engine spiders to help them reach all pages on the site. On small sites, such sitemaps often include all pages of the site, on large ones, it might just be high level pages. The .txt file is non html and exists to provide search engines with a full list of urls on the site for the sole purpose of helping search engines index all the site's pages.
The robots.txt file can also be used to specify the location of the sitemap.txt file such as
sitemap: http://www.example.com/sitemap_location.txt
Are you sure the sitemap is being blocked by the robots.txt file or is the robots.txt file just listing the location of the sitemap.txt?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can subdomains hurt your primary domain's SEO?
Our primary website https://domain.com has a subdomain https://subDomain.domain.com and on that subdomain we have a jive-hosted community, with a few links to and fro. In GA they are set up as different properties but there are many SEO issues in the jive-hosted site, in which many different people can create content, delete content, comment, etc. There are issues related to how jive structures content, broken links, etc. My question is this: Aside from the SEO issues with the subdomain, can the performance of that subdomain negatively impact the SEO performance and rank of the primary domain? I've heard and read conflicting reports about this and it would be nice to hear from the MOZ community about options to resolve such issues if they exist. Thanks.
Intermediate & Advanced SEO | | BHeffernan1 -
Organic listings disappeared I don't know why!
Brief history: I am MD of a medium sized health organisation in the UK. We have one of the leading websites in the world for our industry. We were hit by a Google algorithm update last year (Penguin or Panda, I can't remember, but that's not relevant here I don't think) and our daily visits went down from around 10,000 to around 5,000 in two separate hits over a couple of months. Then there was a steady decrease to about 3,000-4,000 visits a day until we totally updated the design of the site and did some good work on the content. We have always been white-hat and the site has around 3,000 pages with unique content added daily. So things have really been on the up for the past couple of months. We have been receiving around 6,000 visits a day in recent weeks (a slow incline over the past few months), until Sunday. Sunday morning around 10am all of our organic listings pretty much disappear, including for our brand name. Monday morning a few come back, including our brand name and our main, most competitive keyword, which we were showing up on the third page for and we returned to this page. Then Tuesday morning another few of our most competitive keywords show up, back where they were before. This includes images which had disappeared from Google images. Our PPC and business listings were not really affected at all. My developer submitted a site map through webmaster tools on Monday morning and I'm not sure if this is the reason pages started to show up again. In our Webmaster tools the indexed pages are about a quarter of all of the ones on the site - all pages were indexed before. I just don't know what has happened! It doesn't make any sense as 1. Google don't seem to have rolled out any algorithm updates on that day 2. we do not have any messages in Webmaster Tools 3. a number of our main keywords have re-appeared - why would that happen if we had been hit by a Google update?! Our organic hits, which previously made up about 80% of all our hits, have gone down by 80% and this is drastically affecting business. If this continues it is likely we will have to downsize the business and I'm not sure what to do. When I saw that the 'indexed pages' in Webmaster tools started to increase (they were around 600 on Monday, around 900 yesterday and then this morning, around 1,300), I thought that we were on our way up and maybe this problem would just resolve itself and our listings would re-appear, but now our indexed pages have reduced slightly since this morning, back down to around 1,100 so the increase has stalled. Can anybody help?! Do you have any idea what could be causing this? Apparently there have been no changes made to robots.txt and my developer says that no changes were made that could have affected our listings. ANY ADVICE WOULD BE GREATLY APPRECIATED.
Intermediate & Advanced SEO | | JH11 -
What to do about old urls that don't logically 301 redirect to current site?
Mozzers, I have changed my site url structure several times. As a result, I now have a lot of old URLs that don't really logically redirect to anything in the current site. I started out 404-ing them, but it seemed like Google was penalizing my crawl rate AND it wasn't removing them from the index after being crawled several times. There are way too many (>100k) to use the URL removal tool even at a directory level. So instead I took some advice and changed them to 200, but with a "noindex" meta tag and set them to not render any content. I get less errors but I now have a lot of pages that do this. Should I (a) just 404 them and wait for Google to remove (b) keep the 200, noindex or (c) are there other things I can do? 410 maybe? Thanks!
Intermediate & Advanced SEO | | jcgoodrich0 -
If I only Link to Page via Sitemap, can it still get indexed?
Hi there! I am creating a ton of content for specific geographies. Is it possible for these pages to get indexed if I only put them in my sitemap and don't link to them through my actual site (though the pages will be live). Thanks!
Intermediate & Advanced SEO | | Travis-W
Travis0 -
Web pages fighting over rank for one keyword. Can it be stopped?
Hey, See attachment. Website is Omega Red. The page I want to rank for seems like it is being held back by other closely related pages with similar titles. I am looking to rank for electrical earthing with this page. On the graph it shows how the other pages have interacted over a period of time on the website and how if they drop out of the top 50 this page then moves up in Google. I don't really want to canonicalise the other pages into one but maybe this is what needs to happen? Any suggestions? bWymgVt.jpg
Intermediate & Advanced SEO | | Hughescov0 -
Site has no SEO done on it. It wasn't considered during design. What to do first ?
They opted for videos to explain to people what the website is about, but it ain't working for them. What steps would you take in order to get this site to rank higher without completely changing the design(changing design is out of the question they are low on funds). They also built a blog on wordpress.com and added a .me domain to it. For obvious reasons I'm not mentioning the website.
Intermediate & Advanced SEO | | ternit0 -
Preferred domain can't set in Web master Tool
I have put my domain name as xxxxxtours.com without www in web master tool. i have redirect to www version using htaccess file .So I wanna put Preferred domain "Display urls as www.xxxxtours.com .When trying it give error as attached image.but i have verified site the .waiting for expert help . Ar5qx.png
Intermediate & Advanced SEO | | innofidelity0 -
.com Outranking my ccTLD's and cannot figure out why.
So I have a client that has a number of sites for a number of different countries with their specific ccTLD. They also have a .com in the US. The problem is that the UK site hardly ranks for anything while the .com ranks for a ton in the UK. I have setup GWT for the UK and the .com to be specific to their geographic locations. So I have the ccTLD and I have GWT showing where I want these sites to rank. Problem is it apparently is not working....Any clues as to what else I could do?
Intermediate & Advanced SEO | | DRSearchEngOpt0