WMT only showing half of a newly submitted XML site map
-
After upgrading design and theme on a relatively high traffic wordpress site, I created an XML site map through Yoast SEO since WP Engine didn't allow the old XML site map plugin I was using.
A site:www.mysite.com search shows Google is indexing about 1,100 pages on my site, yet the XML site map I submitted shows "458 URLs submitted and 467 URLs indexed."
These numbers are about 1/2 of what they should be. My old site map had about 1,100 URLs and 965 or so indexed (used noindex on some low value pages.)
Any ideas as to what may be wrong?
-
I just did a site: search for your domain and looks like 1140 pages are indexed, so I'm assuming this got itself settled?
Congrats! Marking as answered.
-
You wont get a duplicate penalty, having duplicate content is not a crime unless you are doing some large scale spamming. duplicate content wont help but it wont hurt either. noindexing will hurt, even with follow you still lose some. Use canonical to fix your problem not noindex.
as for the sitemap, It is my suspicion that not al the maps are being read. I also don't know much about yoast sitemaps, I always us the xml standard.
Bing and Google have their own sitmap generation software, that you can use that lets them make your site map for you.
-
Thanks Alan,
Sure, here is the site map: http://www.nationalbankruptcyforum.com/sitemap_index.xml
As far as noindexing pages is concerned, I always use noindex, follow, but choose to noindex category and author archive pages as I think they can cause duplicate content/ Panda issues.
John
-
Can we see your sitemap.xml to look for any problems.
I would not be concerned, as sitemaps are not much help for sites that have good linking, a site map should not include all your links according to Duane forrester of bing, but the main pages only.
What is a concern is the noindexing of pages you mention. any links pointing to non indexed pages are wasting their link juice, there is nothing to gain by noindexing pages but a lot to lose. if you really mush noindex a page use the meta tag noindex,foloow, so the search engine follows the links and you will get some of the link juice back.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Webmaster tools showing 200 page load ok - all other testing tools show a 301
hey, on https://www.xxx.co we've setup a 301 redirect to xxx.us - > BUT in webmaster tools its still showing a 200 load ok, whereas on all other testing tools its showing a 301 redirect (screamingfrog etc) even https://dns.google.com/query?name=www.xxx.co is showing that its 301 redirected. Any ideas? as we want to trigger the change of address tool in WMT and its saying it cant as it loads the homepage still....
Technical SEO | | RobertN-London0 -
URL Indexed But Not Submitted to Sitemap
Hi guys, In Google's webmaster tool it says that the URL has been indexed but not submitted to the sitemap. Is it necessary that the URL be submitted to the sitemap if it has already been indexed? Appreciate your help with this. Mark
Technical SEO | | marktheshark100 -
Confused about repeated occurences of URL/essayorg/topic/ showing up as 404 errors in our site logs
Working on a Wordpress website, https://thedoctorwithin.comScanning the site’s 404 errors, I’m seeing a lot of searches for URL/essayorg/topic, coming from Bingbot, as well as other spiders (Google, OpensiteExlorer). We get at least 200 of these irrelevant requests per week. Seems like each topic that follows /essayorg/ is unique. Some include typos: /dissitation/Haven't done a verification to make sure the spiders are who they say they are, yet.Almost seems like there are many links ‘in the wild’ intended for Essay.Org that are being directed towards the site I’m working on.I've considered redirecting any requests for URL/essayorg/ to our sitemap… figuring that might encourage further spidering of actual site content. Is redirection to our sitemap xml file a good idea, or might doing so have unintended consequences? Interested in suggestions about why this might be occurring. Thank you.
Technical SEO | | linkjuiced0 -
When will all of Google Maps be the same again?
As many of you are aware that the pigeon update was only applied to the new Google maps resulting in very different search results for Google local business. When you search for a business on old Google maps then you get totally different results vs the new Google maps. Some businesses totally disappeared completely from the search results. I have done my research and found out that it's because the new Algo was only applied to the new maps. Also new algo does not apply to other countries. Well the reason I posted this topic is because I have noticed that all the new Google Business listings I am verifying for my clients are all being put under the old Google maps and not the new ones. They come up fine when searching from old maps but not the new ones. I understand Google has not rolled out the pigeon on all data centers but why? Will Google eventually roll out the update to old maps? Since Google is adding businesses to old google maps then what's the point of even adding new listings?
Technical SEO | | bajaseo0 -
WMT "Index Status" vs Google search site:mydomain.com
Hi - I'm working for a client with a manual penalty. In their WMT account they have 2 pages indexed.If I search for "site:myclientsdomain.com" I get 175 results which is about right. I'm not sure what to make of the 2 indexed pages - any thoughts would be very appreciated. google-1.png google-2.png
Technical SEO | | JohnBolyard0 -
Duplicate Content - Mobile Site
We think that a mobile version of our site is causing a duplicate content issue; what's the best way to stop the mobile version being indexed. Basically the site forwards mobile users to "/mobile" which is just a mobile optimised version of the original site. Is it best to block the /mobile folder from being crawled?
Technical SEO | | nsmith7870 -
Robots.txt blocking site or not?
Here is the robots.txt from a client site. Am I reading this right --
Technical SEO | | 540SEO
that the robots.txt is saying to ignore the entire site, but the
#'s are saying to ignore the robots.txt command? See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file To ban all spiders from the entire site uncomment the next two lines: User-Agent: * Disallow: /0 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0