Any SEO-wizards out there who can tell me why Google isn't following the canonicals on some pages?
-
Hi,
I am banging my head against the wall regarding the website of a customer: under "duplicate title tags" in GSC I can see that Google is indexing a whole bunch of parameterized versions of many of the URLs on the site. When I check the rel=canonical tag, everything seems correct. My customer is the biggest sports retailer in Norway. Their webshop has approximately 20 000 products, yet they have more than 400 000 pages indexed by Google.
So why is Google indexing pages like this? What is missing in this canonical?
https://www.gsport.no/herre/klaer/bukse-shorts?type-bukser-334=regnbukser&order=price&dir=desc
Why isn't Google just cutting off the ?type-bukser-334=regnbukser&order=price&dir=desc part of the URL?
Can it be the canonical tag itself, or could the problem be somewhere in the CMS?
Looking forward to your answers
- Sigurd
-
Thank you all! I have forwarded this to the owner of the site, so now we'll just sit back and see the effects.
-
Hi Inevo,
David's and Jake's comments and recommendations are spot on. You need to update your robots.txt file. Jake is correct when he said "just because a canonical tag is in place, that doesn't prevent Google from crawling and indexing the page."
Sincerely,
Dana
-
Hi Inevo,
Canonical tags are being used correctly and it doesn't actually look like any of the URLs with query strings are indexed in Google.
I'm going to go off the topic of canonicals now, but still related to the crawl and index of the site:
Has the site changed CMS in the last year or two? It's possible that some of the 400k URLs indexed are old or were not canonicalized properly at some point in time, so they were indexed.
The problem with how the site is currently set up is that it is basically impossible for search engines to crawl because of the product filter. I wrote an article about this a while ago (link), specifically to do with product filters in Magento. Product filters can turn your site into a 'black hole' for search engines, which is definitely happening in this case (try crawling it with Screaming Frog).
I'd recommend blocking product filter URLs from being crawled so that search engines are only crawling important pages on the site.
You should be able to fix this by adding these 3 lines to your robots.txt:
Disallow: *?
Disallow: *+
Allow: *?p=
(Note: please check that you don't need to add more parameters to Allow.)
These changes will make crawling your site much more efficient - from millions of crawlable URLs, to probably 30-35k.
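Before deploying rules like these, it can help to check a handful of real URLs against them. Below is a minimal Python sketch of Google-style wildcard matching, assuming `*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie. The helper names (`pattern_to_regex`, `is_allowed`) are hypothetical, and the sketch is not a substitute for testing with Google's own tools. Python's built-in `urllib.robotparser` is not used here because it does not handle wildcards.

```python
import re
from urllib.parse import urlsplit

# The three rules suggested above, in (kind, pattern) form.
RULES = [
    ("disallow", "*?"),
    ("disallow", "*+"),
    ("allow", "*?p="),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(url, rules=RULES):
    """Return True if the URL may be crawled under the given rules."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    best = None  # (pattern length, is_allow); tuple order breaks ties toward Allow
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(target):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the URL is crawlable by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    base = "https://www.gsport.no/herre/klaer/bukse-shorts"
    for url in (
        base,
        base + "?p=2",
        base + "?type-bukser-334=regnbukser&order=price&dir=desc",
        base + "?order=price&p=2",
    ):
        print(is_allowed(url), url)
```

Note that under these assumptions a paginated URL is only allowed when `p` is the first parameter: `?order=price&p=2` does not contain `?p=` and is therefore blocked, which is one reason to double-check the Allow line.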
Let me know how this goes for you
Cheers,
David
-
I would definitely check to make sure the canonical tag is being properly used. Make sure it is an absolute URL rather than a relative URL.
That being said, please note that just because a canonical tag is in place, that doesn't prevent Google from crawling and indexing the page, and including the page in search results with the site:domain command. If you see the canonicalized URLs outranking their canonical, then you can start to question why Google isn't honoring the canonical.
Please note that canonical tags are a recommendation and not a directive, meaning Google doesn't have to honor them if it does not consider the target page to be the true canonical.
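The absolute-versus-relative check mentioned above can be scripted. Here is a rough Python sketch using only the standard library; it pulls every `rel="canonical"` link out of a page's HTML and reports whether each `href` has both a scheme and a host. The class and function names are hypothetical, and the sample markup is made up for illustration, not copied from the actual site.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.hrefs.append(attr["href"])


def check_canonical(html):
    """Return a list of (href, is_absolute) for each canonical link found."""
    finder = CanonicalFinder()
    finder.feed(html)
    report = []
    for href in finder.hrefs:
        parts = urlsplit(href)
        # Absolute means both a scheme (https) and a host are present.
        report.append((href, bool(parts.scheme and parts.netloc)))
    return report


if __name__ == "__main__":
    # Illustrative markup only (assumed, not taken from the real page).
    sample = (
        '<head><link rel="canonical" '
        'href="https://www.gsport.no/herre/klaer/bukse-shorts"></head>'
    )
    print(check_canonical(sample))
```

More than one entry in the returned list is also worth flagging, since a page with several conflicting canonical tags gives search engines an unclear signal.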
-Jake