Sitemap issues
-
Hi all,
Okay, I'm a bit confused here. It says I have submitted 72 (pages, I'm assuming), but only 2 pages have been indexed?
I submitted a new sitemap for each of my 3 top-level domains, checked it today, and it's showing the attached result.
We are still having issues with meta tags showing up in the incorrect country.
If anyone knows how I can attend to this nightmare, it would be much appreciated lol
-
Awesome response Dirk! Thanks again for your endless help!
-
Hey again!
...I can't believe I didn't think of the simplicity of this earlier...
"it's even faster if you sort the url's in alphabetical order & delete the rows containing priority / lastmod /.. - then you only need to do a find/replace on the <loc>/</loc> "
100,000 spreadsheets and I don't even think of sorting for this task. Unreal. I laughed at myself aloud when I read it.
Thank you per usual my friend!
-
Hi Patrick,
Your method is a good one - I use more or less the same trick to retrieve url's from a sitemap in xls (it's even faster if you sort the url's in alphabetical order & delete the rows containing priority / lastmod /.. - then you only need to do a find/replace on the <loc>/</loc> )
It's just in this specific case as the sitemap was generated in Screaming Frog that it's easier to eliminate these redirected url's upfront.
Dirk
-
Thanks so much Dirk - this is great. I was speaking to how I found the specific errors. Thanks for posting this for the sitemap - definitely left a big chunk out on my part!
-
Hi Justin
Patrick's how-to is correct - but as you are generating your sitemap using Screaming Frog there is really no need to go through this manual processing.
If you only need to create the sitemap:
Go to Configuration > Spider -
Tab: Basic settings: uncheck everything apart from "Crawl Canonicals" (unchecking Images/CSS/JS/External links is not strictly necessary but speeds up the crawl)
Advanced: Check "Always Follow redirects" / "Respect Noindex" / "Respect Canonical"
After the crawl - generate the sitemap - it will now only contain the "final" url's - after the redirects.
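For anyone doing this outside Screaming Frog, the same idea (keep only the "final" URLs after the redirects) can be sketched in Python. This is only an illustration: the `redirects` mapping and the function names are hypothetical, and the example.com URLs are placeholders, not the real site.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def final_url(url, redirects, max_hops=10):
    """Follow a redirect mapping until a URL no longer redirects (or a loop is hit)."""
    seen = set()
    while url in redirects and url not in seen and len(seen) < max_hops:
        seen.add(url)
        url = redirects[url]
    return url


def build_sitemap(urls, redirects):
    """Return sitemap XML listing each final destination exactly once."""
    finals = sorted({final_url(u, redirects) for u in urls})
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in finals:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical crawl result: old -> new is a 301, so only "new" should be listed.
    redirects = {"https://www.example.com/old/": "https://www.example.com/new/"}
    crawled = ["https://www.example.com/old/", "https://www.example.com/about/"]
    print(build_sitemap(crawled, redirects))
```

The redirected URL never appears in the output; only its destination does, which is what the crawl settings above are meant to achieve.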
Hope this helps,
Dirk
PS Try to avoid internal links which are redirected - better to replace these links by links to the final destination
-
Hi Justin
Probably the easiest way to eliminate these cross references is to ask your programmer to put all links as relative links rather than as absolute links. Relative links have the disadvantage that they can generate endless loops if something is wrong with the HTML - but this is something you can easily check with Screaming Frog.
If you check the .com version (for example https://www.zenory.com/blog/tag/love/), it's calling zenory.co.nz for plenty of links (just check the source and search for .co.nz), in both the http and the https version.
You can check all these pages by hand, but I guess your programmer should be able to do this in an automated way.
It is also the case the other way round: on the .co.nz version you'll find references in the source to the .com version.
In Screaming Frog, the links with "NZ" are the only ones which should stay absolute, as they point to the other version.
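As a rough idea of what such an automated check could look like, here is a minimal Python sketch. It assumes the three domains mentioned in this thread, and the function and class names are made up for the example. It scans a page's HTML for href/src values that point at one of the other country domains.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumption: the three country domains discussed in this thread.
SITE_DOMAINS = {"zenory.com", "zenory.co.nz", "zenory.com.au"}


class _RefCollector(HTMLParser):
    """Collect every href/src attribute value found in a page."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.refs.append((tag, value))


def cross_references(html: str, page_domain: str) -> list[tuple[str, str]]:
    """Return (tag, url) pairs pointing at one of the *other* country domains."""
    collector = _RefCollector()
    collector.feed(html)
    found = []
    for tag, url in collector.refs:
        # Relative links have an empty host and are skipped automatically.
        domain = urlparse(url).netloc.lower().removeprefix("www.")
        if domain in SITE_DOMAINS and domain != page_domain:
            found.append((tag, url))
    return found


# Hypothetical snippet of a .com page, for illustration only.
SAMPLE = """
<a href="/blog/tag/love/">Love</a>
<a href="https://www.zenory.co.nz/blog/tag/love/">Latest article</a>
<a href="https://www.zenory.com.au/">AU</a>
"""

if __name__ == "__main__":
    for tag, url in cross_references(SAMPLE, "zenory.com"):
        print(tag, url)
```

Note that this also reports the country-switcher links, which are meant to stay absolute, so those would need to be filtered out by hand or by anchor text.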
Hope this clarifies
Dirk
-
Wow thanks Patrick, let me run this and see how I go, thanks so much for your help!!!
-
Dirk, thanks so much for your help!
Could you tell me how to identify the URLs that are cross-referencing? I tried using Screaming Frog and looked under **External and clicked on Inlinks and Outlinks.** But what's really caught my eye is that a lot of the links are from the blog with the same anchor text "name", while others are showing up with a different name as well. Some are showing NZ NZ or AU AU as the anchor text, and I think this has to do with the flag drop-down used to change the top-level domains.
For example:
FROM: https://www.zenory.co.nz/blog/tag/love/
TO: https://www.zenory.com.au/categories/love-relationships
Anchor Text: Twinflame Reading
-
Hi Justin
Yep! I use Screaming Frog, here's how I do it:
-
Go to your /sitemap.xml
-
Select all + copy
-
Paste into Excel column A
-
Select column A
-
Turn "Wrap Text" off
-
Delete rows 1 through 5
-
Select column A again
-
"Find and Replace" the following:
-
<lastmod></lastmod>
-
<changefreq></changefreq>
-
daily
-
Whatever the date is
-
Priority numbers, usually 0.5 to 1.0
-
"Replace With" nothing, no spaces, nothing
-
You'll hit "Replace All" after every text string you put in, one at a time
-
With Column A still selected, hit F5
-
Click "Special"
-
Click "Blanks" and "OK"
-
Right click in the spreadsheet
-
Select "Delete" and "Shift cells up"
Voilà! You have your list. Now copy this list and open Screaming Frog. Click "Mode" up top and click "List". Click "Upload List" and click "Paste". Paste your URLs in there and hit Start.
Your sitemap will be crawled.
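If you would rather skip the spreadsheet, the extraction steps above can be roughly sketched in Python with the standard library. This assumes the sitemap follows the sitemaps.org 0.9 format. The `extract_urls` name and the sample XML are hypothetical, with example.com standing in for the real site.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return the <loc> values of a sitemap, sorted and de-duplicated."""
    root = ET.fromstring(sitemap_xml)
    locs = (el.text.strip() for el in root.findall(".//sm:loc", SITEMAP_NS) if el.text)
    return sorted(set(locs))


# Hypothetical sample; in practice this would be the saved contents of /sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/blog/b/</loc>
    <lastmod>2015-06-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://www.example.com/blog/a/</loc>
    <priority>0.5</priority>
  </url>
</urlset>"""

if __name__ == "__main__":
    # One URL per line, ready to paste into a list-mode crawl.
    print("\n".join(extract_urls(SAMPLE)))
```

Because only the `<loc>` elements are read, there is nothing to find/replace: lastmod, changefreq and priority are simply ignored.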
Here are URLs that returned 301 redirects:
https://www.zenory.com/blog/chat-psychic-readings/
https://www.zenory.com/blog/online-psychic-readings-private/
https://www.zenory.com/blog/live-psychic-readings/
https://www.zenory.com/blog/online-psychic-readings/
Here are URLs that returned 503 Service Unavailable codes twice, but 200s now:
https://www.zenory.com/blog/spiritually-love/
https://www.zenory.com/blog/automatic-writing-psychic-readings/
https://www.zenory.com/blog/author/psychic-nori/
https://www.zenory.com/blog/zodiac-signs/
https://www.zenory.com/blog/soul-mate-relationship-challenges/
https://www.zenory.com/blog/author/zenoryadmin/
https://www.zenory.com/blog/soulmate-separation-break-ups/
https://www.zenory.com/blog/how-to-find-a-genuine-psychic/
https://www.zenory.com/blog/mind-body-soul/
https://www.zenory.com/blog/twin-flame-norishing-your-flame-to-find-its-twin/
https://www.zenory.com/blog/tips-psychic-reading/
https://www.zenory.com/blog/tips-to-dealing-with-a-broken-heart/
https://www.zenory.com/blog/the-difference-between-soul-mates-and-twin-flames/
https://www.zenory.com/blog/sex-love/
https://www.zenory.com/blog/psychic-advice-break-ups/
https://www.zenory.com/blog/author/ginny/
https://www.zenory.com/blog/chanelling-psychic-readings/
https://www.zenory.com/blog/first-release-cycle-2015/
https://www.zenory.com/blog/psychic-shaman-readings/
https://www.zenory.com/blog/chat-psychic-readings/
https://www.zenory.com/blog/psychic-medium-psychic-readings/
https://www.zenory.com/blog/author/trinity/
https://www.zenory.com/blog/psychic-readings-karmic-relationships/
https://www.zenory.com/blog/can-psychic-readings-heal-broken-heart/
https://www.zenory.com/blog/guidance-psychic-readings/
https://www.zenory.com/blog/mercury-retrograde-effects-life/
https://www.zenory.com/blog/online-psychic-readings-private/
https://www.zenory.com/blog/psychics-mind-readers/
https://www.zenory.com/blog/angel-card-readings-psychic-readings/
https://www.zenory.com/blog/cheating-relationship/
https://www.zenory.com/blog/long-distance-relationship/
https://www.zenory.com/blog/soulmate-psychic-reading/
https://www.zenory.com/blog/live-psychic-readings/
https://www.zenory.com/blog/psychic-readings-using-rune-stones/
https://www.zenory.com/blog/psychic-clairvoyant-psychic-readings/
https://www.zenory.com/blog/psychic-guidance-long-distance-relationships/
https://www.zenory.com/blog/author/libby/
https://www.zenory.com/blog/online-psychic-readings/
I would check on that when you can. Check in Webmaster Tools if any issues have arrived there as well.
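For a repeatable check without a crawler, a status report like the one above could be approximated with a short Python script. Treat this as a sketch under assumptions: the function names are invented for the example, it sends HEAD requests (some servers answer those differently from GET), and network errors such as timeouts are not handled.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a HEAD request, without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for unfollowed 3xx as well as 4xx/5xx; the code is what we want.
        return err.code


def group_by_status(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> dict[int, list[str]]:
    """Map each status code to the URLs that returned it."""
    groups: dict[int, list[str]] = {}
    for url in urls:
        groups.setdefault(fetch(url), []).append(url)
    return groups


if __name__ == "__main__":
    # Stubbed fetcher so the example runs offline; omit `fetch` to make real requests.
    fake = {"https://www.example.com/old/": 301, "https://www.example.com/new/": 200}
    print(group_by_status(fake, fetch=fake.__getitem__))
```

Since the 503s in this thread were intermittent, running the check more than once and comparing the results would be sensible.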
Hope this helps! Good luck!
-
Thanks so much Patrick! Can you recommend how I would go about finding the urls that are redirecting in the sitemap? I'm assuming screaming frog?
-
Hi Justin
Google doesn't seem to be figuring out (even with the correct hreflang in place) which site should be shown for each country.
If you look at the cached versions of your .com.au and .com sites, it is always the .co.nz version which is cached. This is probably also the reason why the meta description is wrong (it's always coming from the .co.nz version) and why the percentage of URLs indexed for each sitemap (for the .com and .com.au versions) is so low.
Try to rigorously eliminate all cross-references in your site - to make it more obvious for Google that these are 3 different sites:
-
in the footer - the links in the second column are pointing to the .co.nz version (latest articles) - change these links to relative ones
-
on all sites there are elements you load from the .com domain (see the latest blog entries: the images are loaded from the .com domain for all TLDs)
As long as you send these confusing signals to Google - Google will mix up the different versions of your site.
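For reference, this is roughly what a consistent set of hreflang annotations looks like, sketched as a small Python generator. The language-region codes below are assumptions (the thread does not state which ones the site uses), and `hreflang_tags` is a made-up helper name. The point is that every version of a page should carry the same full set of alternates, including a reference to itself.

```python
from html import escape

# Assumption: one plausible code per country site; adjust to the codes actually in use.
ALTERNATES = {
    "en-nz": "https://www.zenory.co.nz",
    "en-au": "https://www.zenory.com.au",
    "x-default": "https://www.zenory.com",
}


def hreflang_tags(path: str) -> list[str]:
    """Return the <link rel="alternate"> tags for one page path, one per country site."""
    return [
        f'<link rel="alternate" hreflang="{code}" href="{escape(base + path, quote=True)}" />'
        for code, base in ALTERNATES.items()
    ]


if __name__ == "__main__":
    # The same three tags would go in the <head> of this page on all three domains.
    print("\n".join(hreflang_tags("/blog/tag/love/")))
```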
rgds,
Dirk
-
Hi there Justin
Everything looks fine from here - there are a couple URLs that need to be updated in your sitemap as they are redirecting.
Google takes time to index, so give this a little more time. You could ask Google to recrawl your URLs, but that isn't necessary at the moment; just something to note.
I would make sure your internal links are all good to go and "follow" so that crawlers can at least find URLs that way.
I did a quick site: search on Google; so far you have 58 pages indexed. You should be okay.
Hope this helps! Good luck!
-
Hi Justin,
A similar question was asked in this post: http://moz.com/community/q/webmaster-tools-indexed-pages-vs-sitemap
Hope this helps you.
Thanks