Sitemap url's not being indexed
-
There is an issue on one of our sites regarding many of the sitemap url's not being indexed. (at least 70% is not being indexed)
The url's in the sitemap are normal url's without any strange characters attached to them, but after looking into it, it seems a lot of the url's get a #. + a number sequence attached to them once you actually go to that url. We are not sure if the "addthis" bookmark could cause this, or if it's another script doing it.
For example
Url in the sitemap: http://example.com/example-category/0246
Url once you actually go to that link: http://example.com/example-category/0246#.VR5a
Just for further information, the XML file does not have any style information associated with it and is in it's most basic form.
Has anyone had similar issues with their sitemap not being indexed properly ?...Could this be the cause of many of these url's not being indexed ?
Thanks all for your help.
-
Anders,
Thanks for the reply. I definitely agree a self referring canonical might just be a good extra addition on these product pages, so I'm definitely adding that to our list of to do's if it does not improve.
In terms of indexing pages - We have not restricted crawl frequency, we have it set to "allow google to determine the optimal crawl rate". No other warnings found within the search console either.
Thanks for your help.
-
I agree - i probably would ignore everything after the "#".
But have you tried added a <link rel="canonical" href="http://example.com/page-url" /> to your pages and see if this will update it? Also: Add the sitemap to your robots.txt if not allready done.
Regarding indexing pages - have you restricted crawl frequency in Google Search Console, or is it set to be determined by GoogleBot? Any other warnings or messages in Search Console?
Best regards,
Anders -
Lesley,
Thanks for the confirmation on that one and the article. Since it doesn't seem like a lot of people on the site are using that address share function, I do not think it would do any harm to remove it.
At least we know the root cause of why it's doing it to the url's. Now the real question is...could it be getting in the way of indexing those url's ?...one would think not, as from what I've read, google would simply ignore what comes after the #.
Thoughts ?
Appreciate the help.
-
Patrick,
We'd prefer to keep the actual url's private, however I can provide further information to help hopefully allow the community to dissect this further:
- It's an E-commerce website, meaning many facets, filters, and possible duplicate content angles
- It seems many of the static pages (/products main page, /contact,etc) are indexed, however it seems the individual products are mostly not being indexed through the sitemap
- While the url's found in webmaster tools under "index" has also steadily been going down, it definitely doesn't correspond with the lack of pages indexed vs submitted within the sitemap
- We have checked robots.txt, and it is not blocking any important pages. (I also had them allow robots to crawl css and js so google could have full access)
- The individual product pages all have the "addthis" feature, meaning they all have a #. + number sequence added to the url's. However one would think this wouldn't be the cause of this lack of indexation ?
Thanks for your help.
-
Yes, add this is doing this to your url. I hate it, that is one reason why I do not use them.
Here is an article on how to remove them, http://support.addthis.com/customer/portal/articles/1013558-removing-all-hashtags-anchors-weird-codes-from-your-urls
-
Hi there
Could you provide you website's URL? It would help the community take a deeper look - thanks!
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
If I'm using a compressed sitemap (sitemap.xml.gz) that's the URL that gets submitted to webmaster tools, correct?
I just want to verify that if a compressed sitemap file is being used, then the URL that gets submitted to Google, Bing, etc and the URL that's used in the robots.txt indicates that it's a compressed file. For example, "sitemap.xml.gz" -- thanks!
Technical SEO | | jgresalfi0 -
Some of my website urls are not getting indexed while checking (site: domain) in google
Some of my website urls are not getting indexed while checking (site: domain) in google
Technical SEO | | nlogix0 -
Duplicate Content - What's the best bad idea?
Hi all, I have 1000s of products where the product description is very technical and extremely hard to rewrite or create an unique one. I'll probably will have to use the contend provided by the brands, which can already be found in dozens of other sites. My options are: Use the Google on/off tags "don't index
Technical SEO | | Carlos-R
" Put the content in an image Are there any other options? We'd always write our own unique copy to go with the technical bit. Cheers0 -
Why would an image that's much smaller than our Website header logo be taking so long to load?
When I check http://www.ccisolutions.com at Pingdom, we have a tiny graphic that is taking much longer to load than other graphics that are much bigger. Can anyone shed some light on why this might be happening and what can be done to fix it? Thanks in advance! Dana
Technical SEO | | danatanseo0 -
How to get out of Google's sendbox
Hello, i posted this question before here in forum, that 2 of my pages were sendboxed but never had a clear answer on how to get them back up, i do know that i need to build high quality backlinks pointing to those pages, but where do i start? Thanks
Technical SEO | | tonyklu0 -
Possible penguin hit but then back, now what's next?
hiz, i did a little check on my site by answering the quiz at mytrafficdropped.com and there was a question about on what dates there was drop in organic. and i did checked my analytics on a top sending keyword. here is what i found. see attached image . Traffic dropped totally on April 20 to onwards. Then got back better in june, but again dropped in October, still down.. anythoughts guys ? 1Jk47.png
Technical SEO | | wickedsunny10 -
We have a decent keyword rich URL domain that's not being used - what to do with it?
We're an ecommerce site and we have a second, older domain with a better keyword match URL than our main domain (I know, you may be wondering why we didn't use it, but that's beside the point now). It currently ranks fairly poorly as there's very few links pointing to it. However, the exact match URL means it has some value, if we were to build a few links to it. What would you do with it: 301 product/category pages to current site's equivalent page Link product/category pages to current site's equivalent page Not bother using it at all Something else
Technical SEO | | seanmccauley0 -
Do any short url's pass link juice? googles own? twitters?
I've read a few posts saying not shorten links at all but we have a lot to tweet and need to. Is googles shortener the best option? I've considered linking to the category index page the article is on and expect the user to find the article and click on the article, I don't like the experience that creates though. I've considered making the article permalink tiny but I would lose the page title being in the url. Is this the best option?
Technical SEO | | Aviawest0