Sitemap issues
-
Hi all,
Okay, I'm a bit confused here: it says I have submitted 72 (pages, I'm assuming), but it's returning that only 2 pages have been indexed?
I submitted a new sitemap for each of my 3 top-level domains, checked it today, and it's showing the attached result.
We are still having issues with meta tags showing up in the incorrect country.
If anyone knows how I can attend to this nightmare, it would be much appreciated lol
-
Awesome response Dirk! Thanks again for your endless help!
-
Hey again!
...I can't believe I didn't think of the simplicity of this earlier...
"it's even faster if you sort the URLs in alphabetical order & delete the rows containing priority / lastmod /.. - then you only need to do a find/replace on the <loc>/</loc> "
100,000 spreadsheets and I don't even think of sorting for this task. Unreal. I laughed at myself aloud when I read it.
Thank you per usual my friend!
-
Hi Patrick,
Your method is a good one - I use more or less the same trick to retrieve URLs from a sitemap in xls (it's even faster if you sort the URLs in alphabetical order & delete the rows containing priority / lastmod /.. - then you only need to do a find/replace on the <loc>/</loc> )
It's just that in this specific case, as the sitemap was generated in Screaming Frog, it's easier to eliminate these redirected URLs upfront.
Dirk
-
Thanks so much Dirk - this is great. I was speaking to how I found the specific errors. Thanks for posting this for the sitemap - definitely left a big chunk out on my part!
-
Hi Justin
Patrick's how-to is correct - but as you are generating your sitemap using Screaming Frog there is really no need to go through this manual processing.
If you only need to create the sitemap:
Go to Configuration > Spider:
Basic tab: uncheck everything apart from "Crawl Canonicals" (unchecking Images/CSS/JS/External links is not strictly necessary but speeds up the crawl)
Advanced tab: check "Always Follow redirects" / "Respect Noindex" / "Respect Canonical"
After the crawl - generate the sitemap - it will now only contain the "final" URLs - after the redirects.
Hope this helps,
Dirk
PS Try to avoid internal links which are redirected - better to replace these links by links to the final destination
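If you ever need to do the same clean-up outside Screaming Frog, the idea can be sketched in a few lines of Python. This is only a rough sketch: it assumes you already have a mapping of redirecting URL to target (for example, copied from a crawl export), and the function names and the `max_hops` limit are made up for illustration.

```python
def resolve_final_url(url, redirects, max_hops=10):
    """Follow a redirect mapping until a URL that no longer redirects is reached."""
    seen = set()
    current = url
    while current in redirects:
        # Stop on loops or suspiciously long chains instead of spinning forever.
        if current in seen or len(seen) >= max_hops:
            raise ValueError(f"redirect loop or chain too long starting at {url}")
        seen.add(current)
        current = redirects[current]
    return current


def final_sitemap_urls(urls, redirects):
    """Return the sorted, de-duplicated list of final URLs for a sitemap."""
    return sorted({resolve_final_url(u, redirects) for u in urls})
```

The same mapping can be used to rewrite internal links so they point straight at the final destination.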
-
Hi Justin
Probably the easiest way to eliminate these cross references is to ask your programmer to put all links as relative links rather than as absolute links. Relative links have the disadvantage that they can generate endless loops if something is wrong with the HTML - but this is something you can easily check with Screaming Frog.
If you check the .com version - for example https://www.zenory.com/blog/tag/love/ - it's calling zenory.co.nz for plenty of links (just check the source & search for .co.nz) - both the http & the https version
You can check all these pages by hand - but I guess your programmer must be able to do this in an automated way.
The same is true the other way round - on the .co.nz version you'll find references in the source to the .com version
In Screaming Frog - the links with "NZ" are the only ones which should stay absolute - as they point to the other version
Hope this clarifies
Dirk
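For the automated check, something along these lines could work. It is a minimal sketch using only the Python standard library; the three hostnames are taken from the URLs in this thread, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hostnames of the three country versions (taken from URLs in this thread).
SIBLING_HOSTS = {"www.zenory.com", "www.zenory.co.nz", "www.zenory.com.au"}


class CrossDomainLinkFinder(HTMLParser):
    """Collect href/src values that point at one of the other country sites."""

    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.cross_links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                host = urlparse(value).netloc.lower()
                # Relative links have an empty host and are skipped here.
                if host in SIBLING_HOSTS and host != self.own_host:
                    self.cross_links.append((tag, value))


def find_cross_domain_links(html, own_host):
    """Return (tag, url) pairs in html that reference a sibling country site."""
    finder = CrossDomainLinkFinder(own_host)
    finder.feed(html)
    return finder.cross_links
```

Feed it the page source of each URL together with the host it was fetched from. Note that it will also report the country switcher links, which are the ones that should stay absolute.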
-
Wow thanks Patrick, let me run this and see how I go, thanks so much for your help!!!
-
Dirk, thanks so much for your help!
Could you tell me how to identify the URLs that are cross-referencing? I tried using Screaming Frog: I looked under External and clicked on Inlinks and Outlinks. But what's really caught my eye is that a lot of the links are from the blog with the same anchor text "name", while others are showing up under a different name as well. Some show NZ NZ or AU AU as the anchor text, and I think this has to do with the flag drop-down for changing the top-level domains.
For example:
FROM: https://www.zenory.co.nz/blog/tag/love/
TO: https://www.zenory.com.au/categories/love-relationships
Anchor Text: Twinflame Reading
-
Hi Justin
Yep! I use ScreamingFrog, here's how I do it:
-
Go to your /sitemap.xml
-
Select all + copy
-
Paste into Excel column A
-
Select column A
-
Turn "Wrap Text" off
-
Delete rows 1 through 5
-
Select column A again
-
"Find and Replace" the following:
-
<lastmod></lastmod>
-
<changefreq></changefreq>
-
daily
-
Whatever the date is
-
Priority numbers, usually 0.5 to 1.0
-
"Replace With" nothing, no spaces, nothing
-
You'll hit "Replace All" after every text string you put in, one at a time
-
With column A still selected, hit F5
-
Click "Special"
-
Click "Blank" and "Ok"
-
Right click in the spreadsheet
-
Select "Delete" and "Shift Rows Up"
Voilà! You have your list. Now copy this list, and open ScreamingFrog. Click "Mode" up top and click "List". Click "Upload List" and click "Paste". Paste your URLs in there and hit Start.
Your sitemap will be crawled.
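If you'd rather skip the spreadsheet, the steps above (keep only the contents of the <loc> tags, drop everything else) can be roughly sketched in Python. This assumes the sitemap uses the standard sitemaps.org namespace and that you have already saved its XML as text; the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (assumed here).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_sitemap_urls(sitemap_xml):
    """Return the sorted list of URLs found in the <loc> elements of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        # lastmod / changefreq / priority are simply never read.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return sorted(urls)
```

The resulting list can be pasted into the List mode in the same way as the spreadsheet column.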
Here are URLs that returned 301 redirects:
https://www.zenory.com/blog/chat-psychic-readings/
https://www.zenory.com/blog/online-psychic-readings-private/
https://www.zenory.com/blog/live-psychic-readings/
https://www.zenory.com/blog/online-psychic-readings/
Here are URLs that returned 503 Service Unavailable codes twice, but 200s now:
https://www.zenory.com/blog/spiritually-love/
https://www.zenory.com/blog/automatic-writing-psychic-readings/
https://www.zenory.com/blog/author/psychic-nori/
https://www.zenory.com/blog/zodiac-signs/
https://www.zenory.com/blog/soul-mate-relationship-challenges/
https://www.zenory.com/blog/author/zenoryadmin/
https://www.zenory.com/blog/soulmate-separation-break-ups/
https://www.zenory.com/blog/how-to-find-a-genuine-psychic/
https://www.zenory.com/blog/mind-body-soul/
https://www.zenory.com/blog/twin-flame-norishing-your-flame-to-find-its-twin/
https://www.zenory.com/blog/tips-psychic-reading/
https://www.zenory.com/blog/tips-to-dealing-with-a-broken-heart/
https://www.zenory.com/blog/the-difference-between-soul-mates-and-twin-flames/
https://www.zenory.com/blog/sex-love/
https://www.zenory.com/blog/psychic-advice-break-ups/
https://www.zenory.com/blog/author/ginny/
https://www.zenory.com/blog/chanelling-psychic-readings/
https://www.zenory.com/blog/first-release-cycle-2015/
https://www.zenory.com/blog/psychic-shaman-readings/
https://www.zenory.com/blog/chat-psychic-readings/
https://www.zenory.com/blog/psychic-medium-psychic-readings/
https://www.zenory.com/blog/author/trinity/
https://www.zenory.com/blog/psychic-readings-karmic-relationships/
https://www.zenory.com/blog/can-psychic-readings-heal-broken-heart/
https://www.zenory.com/blog/guidance-psychic-readings/
https://www.zenory.com/blog/mercury-retrograde-effects-life/
https://www.zenory.com/blog/online-psychic-readings-private/
https://www.zenory.com/blog/psychics-mind-readers/
https://www.zenory.com/blog/angel-card-readings-psychic-readings/
https://www.zenory.com/blog/cheating-relationship/
https://www.zenory.com/blog/long-distance-relationship/
https://www.zenory.com/blog/soulmate-psychic-reading/
https://www.zenory.com/blog/live-psychic-readings/
https://www.zenory.com/blog/psychic-readings-using-rune-stones/
https://www.zenory.com/blog/psychic-clairvoyant-psychic-readings/
https://www.zenory.com/blog/psychic-guidance-long-distance-relationships/
https://www.zenory.com/blog/author/libby/
https://www.zenory.com/blog/online-psychic-readings/
I would check on that when you can. Check in Webmaster Tools whether any issues have shown up there as well.
Hope this helps! Good luck!
-
-
Thanks so much Patrick! Can you recommend how I would go about finding the URLs that are redirecting in the sitemap? I'm assuming Screaming Frog?
-
Hi Justin
Google doesn't seem to be figuring out (even with the correct hreflang in place) which site should be shown for each country.
If you look at the cached versions of your .com.au & .com sites, it's always the .co.nz version which is cached - this is probably also the reason why the meta description is wrong (it's always coming from the .co.nz version) and why the % of URLs indexed for each sitemap (for the .com & .com.au version) is so low.
Try to rigorously eliminate all cross-references in your site - to make it more obvious for Google that these are 3 different sites:
-
in the footer - the links in the second column are pointing to the .co.nz version (latest articles) - change these links to relative ones
-
on all sites there are elements you load from the .com domain (see latest blog entries - the images are loaded from the .com domain for all TLDs)
As long as you send these confusing signals to Google - Google will mix up the different versions of your site.
rgds,
Dirk
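As an illustration of what "change these links to relative ones" means in practice, here is a minimal Python sketch. The hostnames come from the URLs in this thread; the function name is made up, and how the rewrite is wired into the site's templates is left open.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames of the three country versions (taken from URLs in this thread).
SITE_HOSTS = {"www.zenory.com", "www.zenory.co.nz", "www.zenory.com.au"}


def to_relative(url):
    """Strip scheme and host from links that point at any of the country sites."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in SITE_HOSTS:
        # External links and links that are already relative are left alone.
        return url
    path = parts.path or "/"
    return urlunsplit(("", "", path, parts.query, parts.fragment))
```

The country switcher links would need to be excluded from this, since those are meant to point to the other versions.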
-
-
Hi there Justin
Everything looks fine from here - there are a couple URLs that need to be updated in your sitemap as they are redirecting.
Google takes time to index, so give this a little more time. You could ask Google to recrawl your URLs, but that's unnecessary at the moment; just something to note.
I would make sure your internal links are all good to go and "follow" so that crawlers can at least find URLs that way.
I did a quick site: search on Google, so far you have 58 pages indexed. You should be okay.
Hope this helps! Good luck!
-
Hi Justin,
Similar question asked in this post @ http://moz.com/community/q/webmaster-tools-indexed-pages-vs-sitemap
Hope this helps you.
Thanks