Search Console rejecting XML sitemap files as HTML files, despite them being XML
-
Hi Moz folks,
We have launched an international site that uses subdirectories for regions and have had trouble getting pages outside of USA and Canada indexed.
Google Search Console accounts have finally been verified, so we can submit the correct regional sitemap to the relevant search console account.
However, when submitting non-USA and CA sitemap files (e.g. AU, NZ, UK), we are receiving a submission error that states, "Your Sitemap appears to be an HTML page," despite them being .xml files, e.g. http://www.t2tea.com/en/au/sitemap1_en_AU.xml.
Queries on this suggest it's a W3 Cache plugin problem, but we aren't using Wordpress; the site is running on Demandware.
Can anyone guide us on why Google Search Console is rejecting these sitemap files? Page indexation is a real issue.
Many thanks in advance!
-
Thanks, both. We'll explore a better solution with Demandware.
-
agree
-
Quite sure that's the case. When I'm following the URL the site also redirects me to a normal page. What is likely is that the same thing is happening to the bots of Google.
-
Extra thought: We're wondering if it's a bigger issue involving the redirect mechanic? Currently, users from a specific country are automatically redirected to their respective locale (e.g. US users trying to access Australian URLs are redirected to /en/us/). Is there something in this where Googlebots aren't able to access AU, NZ and UK subdirectories and sitemap files because they're coming from North America?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Search Console - Mobile Usability Errors
A site I'm looking at for a client had 100's of pages flagged as having Mobile Usability errors in Search Console. I found that the theme uses parameters in the URLs of some of theme resources (.js/.css) to identify the version strings. These were then being blocked by a rule in the robots.txt: "Disallow: /*?" I've removed this rule, and now when I inspect URLs and test the live versions of the page they are now being reported as mobile friendly. I then submitted validation requests in Search Console for both of the errors ("Text to small" and "Clickable Elements too close") My problem now, is that the validation has completed and the pages are still being reported as having the errors. I've double checked and they're find if I inspect them individually. Does anyone else have experience clearing these issues in Search Console? Any ideas what's going on here!
Technical SEO | | DougRoberts1 -
Google Search console says 'sitemap is blocked by robots?
Google Search console is telling me "Sitemap contains URLs which are blocked by robots.txt." I don't understand why my sitemap is being blocked? My robots.txt look like this: User-Agent: *
Technical SEO | | Extima-Christian
Disallow: Sitemap: http://www.website.com/sitemap_index.xml It's a WordPress site, with Yoast SEO installed. Is anyone else having this issue with Google Search console? Does anyone know how I can fix this issue?1 -
Desktop & Mobile XML Sitemap Submitted But Only Desktop Sitemap Indexed On Google Search Console
Hi! The Problem We have submitted to GSC a sitemap index. Within that index there are 4 XML Sitemaps. Including one for the desktop site and one for the mobile site. The desktop sitemap has 3300 URLs, of which Google has indexed (according to GSC) 3,000 (approx). The mobile sitemap has 1,000 URLs of which Google has indexed 74 of them. The pages are crawlable, the site structure is logical. And performing a Landing Page URL search (showing only Google/Organic source/medium) on Google Analytics I can see that hundreds of those mobile URLs are being landed on. A search on mobile for a longtail keyword from a (randomly selected) page shows a result in the SERPs for the mobile page that judging by GSC has not been indexed. Could this be because we have recently added rel=alternate tags on our desktop pages (and of course corresponding canonical ones on mobile). Would Google then 'not index' rel=alternate page versions? Thanks for any input on this one. PmHmG
Technical SEO | | AlisonMills0 -
Adding your sitemap to robots.txt
Hi everyone, Best practice question: When adding your sitemap to your robots.txt file, do you add the whole sitemap at once or do you add different subcategories (products, posts, categories,..) separately? I'm very curious to hear your thoughts!
Technical SEO | | WeAreDigital_BE0 -
Some competitors have a thumbnail in Google search results
I've noticed that a few of my top competitors have a small photo (thumbnail) next to their listing. I'm sure it's not a coincidence that they are ranked top for the search phrase too. Is this really a help and how can it be done? Many thanks, Iain.
Technical SEO | | iainmoran0 -
Disallow: /search/ in robots but soft 404s are still showing in GWT and Google search?
Hi guys, I've already added the following syntax in robots.txt to prevent search engines in crawling dynamic pages produce by my website's search feature: Disallow: /search/. But soft 404s are still showing in Google Webmaster Tools. Do I need to wait(it's been almost a week since I've added the following syntax in my robots.txt)? Thanks, JC
Technical SEO | | esiow20130 -
Is it bad to have same page listed twice in sitemap?
Hello, I have found that from an HTML (not xml) sitemap of a website, a page has been listed twice. Is it okay or will it be considered duplicate content? Both the links use same anchor text, but different urls that redirect to another (final) page. I thought ideal way is to use final page in sitemap (and in all internal linking), not the intermediate pages. Am I right?
Technical SEO | | StickyRiceSEO1 -
How to handle .mobi and normal website for mobile search and regular search
Hi, we have our regular website at jameda.de and a mobile only page at jameda.mobi Users on mobile devices will be automatically redirected to .mobi if they click on a link to jameda.de in the SERPs. What is the best practice to ensure, that Googlebot is indexing jameda.de and Googlebot Mobile is indexing jameda.mobi without duplicate content issues and having Link-Juice benefits on mobile search at the same time? Thanks a lot
Technical SEO | | jameda0