Search Console rejecting XML sitemap files as HTML files, despite them being XML
-
Hi Moz folks,
We have launched an international site that uses subdirectories for regions, and we have had trouble getting pages outside the USA and Canada indexed.
The Google Search Console accounts have finally been verified, so we can now submit each regional sitemap to the relevant Search Console account.
However, when submitting sitemap files for regions other than the USA and Canada (e.g. AU, NZ, UK), we receive a submission error that states, "Your Sitemap appears to be an HTML page," despite them being .xml files, e.g. http://www.t2tea.com/en/au/sitemap1_en_AU.xml.
Searches on this error suggest it's a W3 Total Cache plugin problem, but we aren't using WordPress; the site is running on Demandware.
Can anyone guide us on why Google Search Console is rejecting these sitemap files? Page indexation is a real issue.
Many thanks in advance!
-
Thanks, both. We'll explore a better solution with Demandware.
-
Agreed.
-
I'm quite sure that's the case. When I follow the URL, the site redirects me to a normal HTML page as well. Most likely the same thing is happening to Googlebot.
-
Extra thought: could this be a bigger issue involving the redirect mechanism? Currently, users from a specific country are automatically redirected to their respective locale (e.g. US users trying to access Australian URLs are redirected to /en/us/). Could it be that Googlebot isn't able to access the AU, NZ and UK subdirectories and sitemap files because it is crawling from North America?
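One way to test the redirect theory is to request the sitemap URL without following redirects and look at what comes back. The sketch below is only an illustration, not anything Demandware or Google provides: the names `classify` and `check_sitemap` and the User-Agent string are made up for this example, and the result reflects the location of the machine running it, so it would need to run from a US-based server to approximate what a crawler coming from North America sees.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so they can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # Returning None makes urllib raise HTTPError for 3xx.


def classify(status, content_type, body_start, location=None):
    """Summarise one response to a sitemap request (hypothetical helper)."""
    if 300 <= status < 400:
        return f"redirect to {location}"
    if status != 200:
        return f"HTTP error {status}"
    text = body_start.lstrip().lower()
    if "html" in content_type.lower() or text.startswith(("<!doctype html", "<html")):
        return "html page"
    if text.startswith("<?xml") or "<urlset" in text or "<sitemapindex" in text:
        return "xml sitemap"
    return "unknown content"


def check_sitemap(url, user_agent="sitemap-check/0.1"):
    """Fetch url once, without following redirects, and classify the reply."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener.open(request, timeout=10) as response:
            status = response.status
            headers = response.headers
            body = response.read(512).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx (and 4xx/5xx) responses land here.
        status, headers, body = err.code, err.headers, ""
    return classify(
        status, headers.get("Content-Type", ""), body, headers.get("Location")
    )
```

Calling `check_sitemap("http://www.t2tea.com/en/au/sitemap1_en_AU.xml")` from a US-based machine and getting back a redirect to a /en/us/ URL, or "html page", would be consistent with the geo-redirect sending crawlers to an HTML page instead of the XML file.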