Sitemap url's not being indexed
-
There is an issue on one of our sites regarding many of the sitemap url's not being indexed. (at least 70% is not being indexed)
The url's in the sitemap are normal url's without any strange characters attached to them, but after looking into it, it seems a lot of the url's get a #. + a number sequence attached to them once you actually go to that url. We are not sure if the "addthis" bookmark could cause this, or if it's another script doing it.
For example
Url in the sitemap: http://example.com/example-category/0246
Url once you actually go to that link: http://example.com/example-category/0246#.VR5a
Just for further information, the XML file does not have any style information associated with it and is in it's most basic form.
Has anyone had similar issues with their sitemap not being indexed properly ?...Could this be the cause of many of these url's not being indexed ?
Thanks all for your help.
-
Anders,
Thanks for the reply. I definitely agree a self referring canonical might just be a good extra addition on these product pages, so I'm definitely adding that to our list of to do's if it does not improve.
In terms of indexing pages - We have not restricted crawl frequency, we have it set to "allow google to determine the optimal crawl rate". No other warnings found within the search console either.
Thanks for your help.
-
I agree - i probably would ignore everything after the "#".
But have you tried added a <link rel="canonical" href="http://example.com/page-url" /> to your pages and see if this will update it? Also: Add the sitemap to your robots.txt if not allready done.
Regarding indexing pages - have you restricted crawl frequency in Google Search Console, or is it set to be determined by GoogleBot? Any other warnings or messages in Search Console?
Best regards,
Anders -
Lesley,
Thanks for the confirmation on that one and the article. Since it doesn't seem like a lot of people on the site are using that address share function, I do not think it would do any harm to remove it.
At least we know the root cause of why it's doing it to the url's. Now the real question is...could it be getting in the way of indexing those url's ?...one would think not, as from what I've read, google would simply ignore what comes after the #.
Thoughts ?
Appreciate the help.
-
Patrick,
We'd prefer to keep the actual url's private, however I can provide further information to help hopefully allow the community to dissect this further:
- It's an E-commerce website, meaning many facets, filters, and possible duplicate content angles
- It seems many of the static pages (/products main page, /contact,etc) are indexed, however it seems the individual products are mostly not being indexed through the sitemap
- While the url's found in webmaster tools under "index" has also steadily been going down, it definitely doesn't correspond with the lack of pages indexed vs submitted within the sitemap
- We have checked robots.txt, and it is not blocking any important pages. (I also had them allow robots to crawl css and js so google could have full access)
- The individual product pages all have the "addthis" feature, meaning they all have a #. + number sequence added to the url's. However one would think this wouldn't be the cause of this lack of indexation ?
Thanks for your help.
-
Yes, add this is doing this to your url. I hate it, that is one reason why I do not use them.
Here is an article on how to remove them, http://support.addthis.com/customer/portal/articles/1013558-removing-all-hashtags-anchors-weird-codes-from-your-urls
-
Hi there
Could you provide you website's URL? It would help the community take a deeper look - thanks!
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
NoIndex tag, canonical tag or automatically generated H1's for automatically generated enquiry pages?
What would be better for automatically generated accommodation enquiry pages for a travel company? NoIndex tag, canonical tag, automatically generated H1's or another solution? This is the homepage: https://www.discoverqueensland.com.au/ You would enquire from a page like this: https://www.discoverqueensland.com.au/accommodation/sunshine-coast/twin-waters/the-sebel-twin-waters This is the enquiry form: https://www.discoverqueensland.com.au/accommodation-enquiry.php?name=The+Sebel+Twin+Waters®ion_name=Sunshine+Coast
Technical SEO | | Kim_Lazaro0 -
Dealing with broken internal links/404s. What's best practice?
I've just started working on a website that has generated lots (100s) of broken internal links. Essentially specific pages have been removed over time and nobody has been keeping an eye on what internal links might have been affected. Most of these are internal links that are embedded in content which hasn't been updated following the page's deletion. What's my best way to approach fixing these broken links? My plan is currently to redirect where appropriate (from a specific service page that doesn't exist to the overall service category maybe?) but there are lots of pages that don't have a similar or equivalent page. I presume I'll need to go through the content removing the links or replacing them where possible. My example is a specific staff member who no longer works there and is linked to from a category page, should i be redirecting from the old staff member and updating the anchor text, or just straight up replacing the whole thing to link to the right person? In most cases, these pages don't rank and I can't think of many that have any external websites linking to them. I'm over thinking all of this? Please help! 🙂
Technical SEO | | Adam_SEO_Learning0 -
Ecommerce website: Product page setup & SKU's
I manage an E-commerce website and we are looking to make some changes to our product pages to try and optimise them for search purposes and to try and improve the customer buying experience. This is where my head starts to hurt! Now, let's say I am selling a T shirt that comes in 4 sizes and 6 different colours. At the moment my website would have 24 products, each with pretty much the same content (maybe differing references to the colour & size). My idea is to change this and have 1 main product page for the T-shirt, but to have 24 product SKU's/variations that exist to give the exact product details. Some different ways I have been considering to do this: a) have drop-down fields on the product page that ask the customer to select their Tshirt size and colour. The image & price then changes on the page. b) All product 24 product SKUs sre listed under the main product with the 'Add to Cart' open next to each one. Each one would be clickable so a page it its own right. Would I need to set up a canonical links for each SKU that point to the top level product page? I'm obviously looking to minimise duplicate content but Im not exactly sure on how to set this up - its a big decision so I need to be 100% clear before signing off on anything. . Any other tips on how to do this or examples of good e-commerce websites that use product SKus well? Kind regards Tom
Technical SEO | | DHS_SH0 -
Product landing page URL's for e-commerce sites - best practices?
Hi all I have built many e-commerce websites over the years and with each one, I learn something new and apply to the next site and so on. Lets call it continuous review and improvement! I have always structured my URL's to the product landing pages as such: mydomain.com/top-category => mydomain.com/top-category/sub-category => mydomain.com/top-category/sub-category/product-name Now this has always worked fine for me but I see more an more of the following happening: mydomain.com/top-category => mydomain.com/top-category/sub-category => mydomain.com/product-name Now I have read many believe that the longer the URL, the less SEO impact it may have and other comments saying it is better to have the just the product URL on the final page and leave out the categories for one reason or another. I could probably spend days looking around the internet for peoples opinions so I thought I would ask on SEOmoz and see what other people tend to use and maybe establish the reasons for your choices? One of the main reasons I include the categories within my final URL to the product is simply to detect if a product name exists in multiple categories on the site - I need to show the correct product to the user. I have built sites which actually have the same product name (created by the author) in multiple areas of the site but they are actually different products, not duplicate content. I therefore cannot see a way around not having the categories in the URL to help detect which product we want to show to the user. Any thoughts?
Technical SEO | | yousayjump0 -
Replacing H1's with images
We host a few Japanese sites and Japanese fonts tend to look a bit scruffy the larger they are. I was wondering if image replacement for H1 is risky or not? eg in short... spiders see: Some header text optimized for seo then in the css h1 {
Technical SEO | | -Al-
text-indent: -9999px;
} h1.header_1{ background:url(/images/bg_h1.jpg) no-repeat 0 0; } We are considering this technique, I thought I should get some advise before potentially jeopardising anything, especially as we are dealing with one of the most important on page elements. In my opinion any attempt to hide text could be seen as keyword stuffing, is it a case that in moderation it is acceptable? Cheers0 -
Sitemap coming up in Google's index?
I apologize if this question's answer is glaringly obvious, but I was using Google to view all the pages it has indexed of our site--by searching for our company and then clicking the link that says to display more results for the site. On page three, it has the sitemap indexed as if it wee just another page of our site. <cite>www.stadriemblems.com/sitemap.xml</cite> Is this supposed to happen?
Technical SEO | | UnderRugSwept0 -
What's the best format for a e-commerce URL product page
We have over 2000 non branded experiences and activities sold through our website. The website is having a face lift with the a new look and a stronger focus on SEO. As part of this, I am keen to establish what the best practice is for product based URLs. I've researched the market and come up with a few alternatives that are used: domain/category/subcategory/activity_name domain/activity_name/category/subcategory/activity_reference domain/generic_term/activity_reference/activity_name domain/category/activity_location/activity_name Activities are location based but the location can change (say once every 2 years). Activity names, category, subcategory and activity_reference rarely change. Are there any thoughts/ research on the best method? (If there is one) Many thanks in advance for your insights.
Technical SEO | | philwill0 -
Rel cannonical on all my URL's
Hi, sorry if this question has already been asked, but I can't seem to find the correct answer. In my crawling report for the domain: http://www.wellbo.de I get rel cannonical notices. I have redirected all pages of http://wellbo.de to http://www.wellbo.de with a 301 redirect. Where is my error? Why do I get these notices? I hope the image helps. Ep7Rw.jpg
Technical SEO | | wellbo0