I need an XML sitemap expert for 5 minutes!
-
Hi all!
I'm hoping that someone with a lot of experience with XML sitemaps can help me out here...
When submitting my sitemap in Google Webmaster Tools, these are the results:
2,414,714 Submitted
34,721 Indexed
And there are also tonnes of warnings.
Would anyone be able to take a quick look at these sitemaps to perhaps advise me on what's going wrong there? These do not load without the www, not sure if this is an issue?
http://www.eumom.ie/sitemap.xml
http://www.eumom.ie/sitemap.xml.gz
Thanks everyone in advance!!
Gavin
-
A few rules about sitemaps:
- You should only include pages you want crawled and indexed.
- They should not contain URLs that return 404s or are blocked by robots.txt.
My guess is that there are too many URLs in the sitemaps, since I doubt the website has over 2 million actual "real" pages.
Also, I randomly clicked on a URL in one of the sitemaps and it 404'd:
http://www.eumom.ie/forums/topic/oakhill-school-leopardstown-/
This is probably causing a lot of the errors you see. It's honestly not a 5 minute fix, but if it were my site, I would be using the Yoast SEO plugin and the sitemap feature within Yoast. It makes it very easy to include or exclude certain pages, and it updates automatically.
I think there must be a way to tell your plugin what to include / exclude from the sitemap but I don't have as much experience with it.
But generally - only include pages you want crawled and indexed. Don't include pages that 404.
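If you want to check this yourself rather than clicking URLs at random, here is a rough Python sketch (standard library only). It pulls every `<loc>` out of a sitemap, flags URLs on a different host than the sitemap itself (e.g. non-www vs www), and reports anything that does not answer 200. The function names are my own, not part of any plugin, and `urlopen` follows redirects, so a redirected URL reports its final status. Treat it as a starting point, not a finished tool.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def off_host_urls(urls, expected_host):
    """Return URLs whose host differs from the host serving the sitemap."""
    expected = expected_host.lower()
    return [u for u in urls if urlparse(u).netloc.lower() != expected]


def http_status(url, timeout=10):
    """Return the HTTP status for url using a HEAD request (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def broken_urls(urls, status_of=http_status):
    """Return (url, status) pairs for every URL that does not answer 200."""
    results = []
    for url in urls:
        status = status_of(url)
        if status != 200:
            results.append((url, status))
    return results
```

To use it, download the sitemap, pass its contents to `extract_urls`, then run `off_host_urls(urls, "www.eumom.ie")` and `broken_urls(urls)` on a sample of the result. With over 2 million URLs you would want to sample and throttle the requests rather than hit every URL.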
-
-
Hi all,
Many thanks for your input so far, much appreciated!
The sitemaps that you are seeing actually were generated using that plugin you mentioned. Formatting-wise, do you see anything wrong with the sitemaps?
Thanks!!
Gavin
-
I couldn't agree more altecdesign!
http://wordpress.org/plugins/google-sitemap-generator/ all the way!
-
That XML sitemap you linked to is formatted in an odd way. I noticed the site you are generating the XML sitemap for is built on WordPress. There is a really solid sitemap plugin you could use to generate your XML and submit it to Google instead of the plugin you are currently using: http://wordpress.org/plugins/google-sitemap-generator/
I've used that plugin numerous times and submitted sitemaps to Google with no errors. Hopefully that helps you out.