Sitemap issues 19 warnings
-
Hi Guys
I seem to be having a lot of sitemap issues.
1. I have 3 top level domains, all with the co.nz sitemap that was submitted
2. I'm in the midst of a site re-design so I'm unsure if I should be updating these now or when the new site goes live (in two weeks)
3. I have 19 warnings from GWT for the co.nz site and they gave me 3 examples looks like 404 errors however I'm not too sure and a bit green on my behalf to find out where the issues are and how to fix them. (it is also showing that 95 pages submitted and only 53 were indexed)
4. I generated recently 2 sitemaps for .com and com.au submitted these both to google and when i create i still see the co.nz sitemap
Would love some guidance around this.
Thanks
-
Glad it was useful!
-
Oh you are a genius yourself Bob Thanks for the great information!
I will look into this and let you know how I go, thanks a bunch you have really helped me move this along and weed out all the confusion!
-
Hi Justin,
In that case I would ask your developer to make the sitemap on the website update automatically (or generate a new one every day). And submit that link to webmaster tools. If he's a real genius he could add your blog pages from Wordpress to this sitemap aswell but I'm not sure if Wordpress has a hook for this.
Alternative options:
- Let him make the automatically updated sitemap for the custom part of the website and use this combined with the sitemap from the yoast plugin. You can upload both separated in Google Webmaster Tools. Make sure both got their own URL. In this case it’s all automated and is just as good as the previous method.
- Keep on updating your sitemap manually. Just make sure you don't use the yoast sitemap and include the blogposts in your sitemap from screaming frog since this would give double input. If you choose to refresh your sitemap manually I would disable the sitemap within the Yoast plugin and use the Screaming frog sitemap which should include your blog pages aswell.
Good luck and let me know if this works for you!
-
Thanks a lot Dirk, your help has been tremendous to my SEO efforts!!!
-
Hi Bob
Thanks alot for your response!
That makes a lot of sense. We use Wordpress only for the blog, but the main site is custom built and doesn't have an yoast plugin.
So I'm not sure how that will work, when I create the site map with screaming frog do I need to include the blog pages in screaming frog if I'm using the yoast plugin?
Thanks again for your help!
-
Yep - you'll have to upload the file to the server first.
Bob's suggestion to generate the sitemap via the Yoast plugin is an excellent idea.
rgds
Dirk
-
Hi Justin,
Thanks for the screenshots. Dirk's suggestion about screaming frog should be really helpful. This should give you an insight in the true 404 errors that a bot can encounter while crawling through your internal site structure.
Based on what I see I think your main problem is the manual updated sitemap. Whenever you change a page, add a new one or mix up some categories those changes won't apply to your sitemap. This creates a 404 error while those pages aren't linked to from your website and (without a sitemap) wouldn't give any 404 error messages in Google Webmaster Tools.
I saw you were using SEO by Yoast already, I suggest using their sitemap functionality. That should resolve the problem and save you work in the future since there is no need to manually update your sitemap again.
Let me know if this works!
-
Hi Justin,
Could you post a screenshot of the error message and any links pointing to this URL? This way we can identify what pages return a 404. If this are important pages on your website I would fix it right now, if it however are pages you don’t use or your visitors rarely see I would make sure you pick this up with the redesign. No point in fixing this now if things will change in the near future. Besides that, sitemaps help you get your website indexed, releasing this two weeks earlier won’t make a big difference for the number of indexed pages since you won’t change your internal link structure and website authority (both help you get more pages indexed).
About your last point, could you provide me with a screenshot of this as well? When I check zenory.com/sitemap.xml I find the .com sitemap, so that part seems fine.
_PS. I would suggest you change your update frequency in your sitemap. It now states monthly, it’s probably a good idea to set this much faster since there is a blog on your website as well. At the moment you are giving Google hints to only crawl your website a few times a month. Keep in mind that you can give different parts of your website a different change frequency. For example, I give pages with user generated content a much higher change frequency then pages we need to update manually. _
-
Hi Justin,
Are the url's going to change when you update the design? If they are not changing you can already update now.
It's not really abnormal to have only a certain % of the sitemap indexed - it could be that Google judges that a certain number of pages is too light in content to be indexed. 55% of url's indexed seems rather low.
Sitemap errors - check the url's that are listed as errors. If I am not mistaken, you use an external tool to generate the sitemaps. It could be that this tools puts all the internal links in the the sitemap; regardless of their status (200, 301, 404) - normally only url's with status 200 should be put in the sitemap. Check the configuration of the tool you use & see if you can only add url's with status 200. Alternatively, you can check the internal linking on your site & make sure that no links exist to 404 pages (Screaming Frog is the tool to use - it can also generate the sitemap).
For the wrong sitemap- as your sites are exact duplicates, probably hosted on the same server, it could be that the .co.nz sitemap overwrites the .com sitemap , as they have the same name. You could rename your sitemap like sitemap_au.xml, sitemap_us.xml & sitemap_nz.xml. This way, if you add a new sitemap for .nz it will not overwrite the .com version. You submit these to Google & you delete the old ones (both on the server & in Google WMT).
Hope this helps.
Dirk
PS. If your design is also changing the url's - don't forget to put redirects in place that lead the old to the new url's.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Third part http links on the page source: Social engineering content warning from Google
Hi, We have received "Social engineering content" warning from Google and one of our important page and it's internal pages have been flagged as "Deceptive site ahead". We wonder what's the reason behind this as Google didn't point exactly to the specific part of the page which made us look so to the Google. We don't employ any such content on the page and the content is same for many months. As our site is WP hosted, we used a WordPress plugin for this page's layout which injected 2 http (non-https) links in our page code. We suspect if this is the reason behind this? Any ideas? Thanks
White Hat / Black Hat SEO | | vtmoz1 -
Will pillar posts create a duplication content issue, if we un-gate ebook/guides and use exact copy from blogs?
Hi there! With the rise of pillar posts, I have a question on the duplicate content issue it may present. If we are un-gating ebook/guides and using (at times) exact copy from our blog posts, will this harm our SEO efforts? This would go against the goal of our post and is mission-critical to understand before we implement pillar posts for our clients.
White Hat / Black Hat SEO | | Olivia9540 -
Robots.txt file in Shopify - Collection and Product Page Crawling Issue
Hi, I am working on one big eCommerce store which have more then 1000 Product. we just moved platform WP to Shopify getting noindex issue. when i check robots.txt i found below code which is very confusing for me. **I am not getting meaning of below tags.** Disallow: /collections/+ Disallow: /collections/%2B Disallow: /collections/%2b Disallow: /blogs/+ Disallow: /blogs/%2B Disallow: /blogs/%2b I can understand that my robots.txt disallows SEs to crawling and indexing my all product pages. ( collection/*+* ) Is this the query which is affecting the indexing product pages? Please explain me how this robots.txt work in shopify and once my page crawl and index by google.com then what is use of Disallow: Thanks.
White Hat / Black Hat SEO | | HuptechWebseo0 -
Duplicate content warning: Same page but different urls???
Hi guys i have a friend of mine who has a site i noticed once tested with moz that there are 80 duplicate content warnings, for instance Page 1 is http://yourdigitalfile.com/signing-documents.html the warning page is http://www.yourdigitalfile.com/signing-documents.html another example Page 1 http://www.yourdigitalfile.com/ same second page http://yourdigitalfile.com i noticed that the whole website is like the nealry every page has another version in a different url?, any ideas why they dev would do this, also the pages that have received the warnings are not redirected to the newer pages you can go to either one??? thanks very much
White Hat / Black Hat SEO | | ydf0 -
Site redesign what to consider to avoid any issues
Hi GUYS I want to avoid getting myself into a bad situation with google, so I'm just wanting to know if there are any steps I would need to take whilst I'm redesigning and developing my site as I'm currently deploying our new designs. One thing I noticed, i have my new designs and content on our development server to run through any checks before deploying it to the live environment, however while our live site is up, I have duplicate content on the live site that exactly matches the dev site for obvious reasons but do I need to tell google that the dev site is for development purposes only so google knows I'm not duplicating content? I have searched around to find some more info about this, if anyone has some insight i would be glad to know your thoughts. Thank you in advance
White Hat / Black Hat SEO | | edward-may0 -
Does Lazy Loading Create Indexing Issues of products?
I have store with 5000+ products in one category & i m using Lazy Loading . Does this effects in indexing these 5000 products. as google says they index or read max 1000 links on one page.
White Hat / Black Hat SEO | | innovatebizz0 -
Partial Sitemaps Impact on SERP
I have a website having 20 different categories. But have the sitemap for only 1 category and rest 19 categories will not have the sitemaps will this have an impact on the search results on not
White Hat / Black Hat SEO | | seosogo0 -
"Unnatural Linking" Warning/Penalty - Anyone's company help with overcoming this?
I have a few sites where I didn't manage the quality of my vendors and now am staring at some GWT warnings for unnatural linking. I'm assuming a penalty is coming down the pipe and unfortunately these aren't my sites so looking to get on the ball with unwinding anything we can as soon as possible. Does anyone's company have experience or could pass along a reference to another company who successfully dealt with these issues? A few items coming to mind include solid and speedy processes to removing offending links, and properly dealing with the resubmission request?
White Hat / Black Hat SEO | | b2bmarketer0