Sitemap url's not being indexed
-
There is an issue on one of our sites regarding many of the sitemap url's not being indexed. (at least 70% is not being indexed)
The url's in the sitemap are normal url's without any strange characters attached to them, but after looking into it, it seems a lot of the url's get a #. + a number sequence attached to them once you actually go to that url. We are not sure if the "addthis" bookmark could cause this, or if it's another script doing it.
For example
Url in the sitemap: http://example.com/example-category/0246
Url once you actually go to that link: http://example.com/example-category/0246#.VR5a
Just for further information, the XML file does not have any style information associated with it and is in it's most basic form.
Has anyone had similar issues with their sitemap not being indexed properly ?...Could this be the cause of many of these url's not being indexed ?
Thanks all for your help.
-
Anders,
Thanks for the reply. I definitely agree a self referring canonical might just be a good extra addition on these product pages, so I'm definitely adding that to our list of to do's if it does not improve.
In terms of indexing pages - We have not restricted crawl frequency, we have it set to "allow google to determine the optimal crawl rate". No other warnings found within the search console either.
Thanks for your help.
-
I agree - i probably would ignore everything after the "#".
But have you tried added a <link rel="canonical" href="http://example.com/page-url" /> to your pages and see if this will update it? Also: Add the sitemap to your robots.txt if not allready done.
Regarding indexing pages - have you restricted crawl frequency in Google Search Console, or is it set to be determined by GoogleBot? Any other warnings or messages in Search Console?
Best regards,
Anders -
Lesley,
Thanks for the confirmation on that one and the article. Since it doesn't seem like a lot of people on the site are using that address share function, I do not think it would do any harm to remove it.
At least we know the root cause of why it's doing it to the url's. Now the real question is...could it be getting in the way of indexing those url's ?...one would think not, as from what I've read, google would simply ignore what comes after the #.
Thoughts ?
Appreciate the help.
-
Patrick,
We'd prefer to keep the actual url's private, however I can provide further information to help hopefully allow the community to dissect this further:
- It's an E-commerce website, meaning many facets, filters, and possible duplicate content angles
- It seems many of the static pages (/products main page, /contact,etc) are indexed, however it seems the individual products are mostly not being indexed through the sitemap
- While the url's found in webmaster tools under "index" has also steadily been going down, it definitely doesn't correspond with the lack of pages indexed vs submitted within the sitemap
- We have checked robots.txt, and it is not blocking any important pages. (I also had them allow robots to crawl css and js so google could have full access)
- The individual product pages all have the "addthis" feature, meaning they all have a #. + number sequence added to the url's. However one would think this wouldn't be the cause of this lack of indexation ?
Thanks for your help.
-
Yes, add this is doing this to your url. I hate it, that is one reason why I do not use them.
Here is an article on how to remove them, http://support.addthis.com/customer/portal/articles/1013558-removing-all-hashtags-anchors-weird-codes-from-your-urls
-
Hi there
Could you provide you website's URL? It would help the community take a deeper look - thanks!
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Generation 'child' sitemaps?
First off, am I correct in thinking that a 'child' sitemap is a sitemap of a subfolder and everything that sits under it, i.e. www.example.com/example If so, can someone give me a good recommendation for generation a free child sitemap please? Many thanks, Rhys
Technical SEO | | SwanseaMedicine0 -
Blog page won't get indexed
Hi Guys, I'm currently asked to work on a website. I noticed that the blog posts won't get indexed in Google. www.domain.com/blog does get indexed but the blogposts itself won't. They have been online for over 2 months now. I found this in the robots.txt file: Allow: / Disallow: /kitchenhandle/ Disallow: /blog/comments/ Disallow: /blog/author/ Disallow: /blog/homepage/feed/ I'm guessing that the last line causes this issue. Does anyone have an idea if this is the case and why they would include this in the robots.txt? Cheers!
Technical SEO | | Happy-SEO2 -
Need advice for new site's structure
Hi everyone, I need to update the structure of my site www.chedonna.it Basicly I've two main problems: 1. I've 61.000 index tag (more with no post)2. The category of my site are noindex I thought to fix my problem making the category index and the tag noindex, but I'm not sure if this is the best solution because I've a great number of tag idexed by Google for a long time. Mybe it is correct just to make the category index and linking it from the post and leave the tag index. Could you please let me know what's your opinion? Regards.
Technical SEO | | salvyy0 -
Sitemap nos being indexed
Hi! How are you? I'm having a problem: for some reason I don't understand, Google Webmasters Tool isn't indexing the sitemaps I'm uploading. One of them is http://chelagarto.com/index.php?option=com_xmap&sitemap=1&view=xml&lang=en . Do you see what could be the problem? It says it only indexed 2 website. I've already sent this Sitemap several times and I'm always getting the same result. I'd really use some advice. Thanks!
Technical SEO | | arielbortz0 -
Will syndicated content hurt a website's ranking potential?
I work with a number of independent insurance agencies across the United States. All of these agencies have setup their websites through one preferred insurance provider. The websites are customizable to a point, but the content for the entire website is mostly the same. Therefore, literally hundreds of agency sites have essentially the same content. The only thing that changes is a few "wildcards" in the copy where the agency fills in their city, state, services areas, company history, etc. My questions is: will this syndicated content hurt their ranking potential? I've been toying with the idea of further editing the content to make it more unique to an agency, but I would hate to waste a lot of hours doing this if it won't help anything. Would you expect this approach to be beneficial or a waste of time? Thank you for your help!
Technical SEO | | copyjack0 -
If multiple links on a page point to the same URL, and one of them is no-followed, does that impact the one that isn't?
Page A has two links on it that both point to Page B. Link 1 isn't no-follow, but Link 2 is. Will Page A pass any juice to Page B?
Technical SEO | | Jay.Neely0 -
Pictures 'being stolen'
Helping my wife with ecommerce site. Selling clothes. Some photos are given by producer, but at times they are not too good. Some are therefore taking their own photos and i suspect ppl are copying them and using them on their own site. Is there anyting to do about this - watermarking of course, but can they be 'marked' in anyway linking to your site ?
Technical SEO | | danlae0 -
Getting a Video Sitemap Indexed
Hi, A client of mine completed a video sitemap to Google Webmaster Tools a couple of months ago. As of yet the videos are still not indexing in Google. All of the videos sit on the one page but have unique URLs in the sitemap. Does anybody know a reason why they are not being indexed? Thanks David
Technical SEO | | RadicalMedia0