Why is a canonicalized URL still in index?
-
Hi Mozers,
We recently canonicalized a few thousand URLs but when I search for these pages using the site: operator I can see that they are all still in Google's index. Why is that?
Is it reasonable to expect that they would be taken out of the index? Or should we only expect that they won't rank as high as the canonical URLs?
Thanks!
-
Also helpful, thanks! Yael
-
Really helpful thanks! Yael
-
Hi yaelslater,
I've seen an experiment about this, and Google will still index the canonicalized urls, but in theory, it will show to a related term search the one the canonicals are pointing to.
The first 2 urls of this search are the ones of the experiment:
You can see in the code the canonical.
The experiment is from 2013! so don't expect those urls to be taken out from the index. If you are looking for this, you should use other techniques like robots.txt to block indexation or redirects 301 to permanently redirect.
Greetings!
-
Hi there!
Canonical tags aren´t that immediate and automatic as its considered.
This happens because google needs to recrawl all your site, more precisely, every of that URL that have been canonicalized.
In my experience with XXL sites when canonicalizing, it takes arround 6 weeks to move arround 60% of the URLs. This checked with Traffic in Search console (most of the cases were when moving to https in the last 12 months)
Also, there are some little to nothing in traffic from those already canonicalized pages, 12 month ago!Yes its reasonable to expect that all those canonicalized URLs to be changed for the correct URL. Even, in some cases when searching for that exact URL the result shows the canonicalized one.
The thing about ranking worse.. its not that simple, because you are telling google to switch URLs.Hope it helps.
Best luck.
GR
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Attack of the dummy urls -- what to do?
It occurs to me that a malicious program could set up thousands of links to dummy pages on a website: www.mysite.com/dynamicpage/dummy123 www.mysite.com/dynamicpage/dummy456 etc.. How is this normally handled? Does a developer have to look at all the parameters to see if they are valid and if not, automatically create a 301 redirect or 404 not found? This requires a table lookup of acceptable url parameters for all new visitors. I was thinking that bad url names would be rare so it would be ok to just stop the program with a message, until I realized someone could intentionally set up links to non existent pages on a site.
Intermediate & Advanced SEO | | friendoffood1 -
Which URLs were indexed 2 years ago?
Hi, I hope anyone can help me with this issue. Our french domain experienced a huge drop of indexed URLs in 2012. More than 50k URLs were indexed, after the drop less than 10k were counted. I would like to check what happened here and which URLs were thrown out of the index. So I was thinking about a comparison between todays data and the data of 2012. Unfortunately we don't have any data on the indexed pages in 2012 beside the number of indexed pages. Is there any way to check, which URLs were indexed 2 years ago?
Intermediate & Advanced SEO | | Sandra_h0 -
HTTPS Certificate Expired. Website with https urls now still in index issue.
Hi Guys This week the Security certificate of our website expired and basically we now have to wail till next Tuesday for it to be re-instated. So now obviously our website is now index with the https urls, and we had to drop the https from our site, so that people will not be faced with a security risk screen, which most browsers give you, to ask if you are sure that you want to visit the site, because it's seeing it as an untrusted one. So now we are basically sitting with the site urls, only being www... My question what should we do, in order to prevent google from penalizing us, since obviously if googlebot comes to crawl these urls, there will be nothing. I did however re-submitted it to Google to crawl it, but I guess it's going to take time, before Google picks up that now only want the www urls in the index. Can somebody please give me some advice on this. Thanks Dave
Intermediate & Advanced SEO | | daveza0 -
Should eCommerce Canonicalize to CMS
We have inherited a site that has a Joomla CMS "showroom" front-end and a Magento "store room" for check out etc. Question - As the site's main pages are in the CMS section should we: make all Magento product pages canonical to the main sections/product pages within the CMS (even though there are no duplicate content issues) "No index" the product pages Index but indicate low page value in sitemap Do something else? 🙂 Thanks for any and all input!
Intermediate & Advanced SEO | | TheNorthernOffice790 -
Drop in indexed pages!
Hi everybody! I've been working on http://thewilddeckcompany.co.uk/ for a little while now. Until recently, everything was great - good rankings for the key terms of 'bird hides' and 'pond dipping platforms'. However, rankings have tanked over the past few days. I can't point my finger at it yet, but a site:thewilddeckcompany.co.uk search shows only three pages have been indexed. There's only 10 on the site, and it was fine beforehand. Any advice would be much appreciated,
Intermediate & Advanced SEO | | Blink-SEO0 -
To index search results or to not index search results?
What are your feelings about indexing search results? I know big brands can get away with it (yelp, ebay, etc). Apart from UGC, it seems like one of the best ways to capture long tail traffic at scale. If the search results offer valuable / engaging content, would you give it a go?
Intermediate & Advanced SEO | | nicole.healthline0 -
To index or not to index search pages - (Panda related)
Hi Mozzers I have a WordPress site with Relevanssi the search engine plugin, free version. Questions: Should I let Google index my site's SERPS? I am scared the page quality is to thin, and then Panda bear will get angry. This plugin (or my previous search engine plugin) created many of these "no-results" uris: /?s=no-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Akids+wall&cat=no-results&pg=6 I have added a robots.txt rule to disallow these pages and did a GWT URL removal request. But links to these pages are still being displayed in Google's SERPS under "repeat the search with the omitted results included" results. So will this affect me negatively or are these results harmless? What exactly is an omitted result? As I understand it is that Google found a link to a page they but can't display it because I block GoogleBot. Thanx in advance guys.
Intermediate & Advanced SEO | | ClassifiedsKing0 -
How to deal with old, indexed hashbang URLs?
I inherited a site that used to be in Flash and used hashbang URLs (i.e. www.example.com/#!page-name-here). We're now off of Flash and have a "normal" URL structure that looks something like this: www.example.com/page-name-here Here's the problem: Google still has thousands of the old hashbang (#!) URLs in its index. These URLs still work because the web server doesn't actually read anything that comes after the hash. So, when the web server sees this URL www.example.com/#!page-name-here, it basically renders this page www.example.com/# while keeping the full URL structure intact (www.example.com/#!page-name-here). Hopefully, that makes sense. So, in Google you'll see this URL indexed (www.example.com/#!page-name-here), but if you click it you essentially are taken to our homepage content (even though the URL isn't exactly the canonical homepage URL...which s/b www.example.com/). My big fear here is a duplicate content penalty for our homepage. Essentially, I'm afraid that Google is seeing thousands of versions of our homepage. Even though the hashbang URLs are different, the content (ie. title, meta descrip, page content) is exactly the same for all of them. Obviously, this is a typical SEO no-no. And, I've recently seen the homepage drop like a rock for a search of our brand name which has ranked #1 for months. Now, admittedly we've made a bunch of changes during this whole site migration, but this #! URL problem just bothers me. I think it could be a major cause of our homepage tanking for brand queries. So, why not just 301 redirect all of the #! URLs? Well, the server won't accept traditional 301s for the #! URLs because the # seems to screw everything up (server doesn't acknowledge what comes after the #). I "think" our only option here is to try and add some 301 redirects via Javascript. Yeah, I know that spiders have a love/hate (well, mostly hate) relationship w/ Javascript, but I think that's our only resort.....unless, someone here has a better way? If you've dealt with hashbang URLs before, I'd LOVE to hear your advice on how to deal w/ this issue. Best, -G
Intermediate & Advanced SEO | | Celts180