Wrong canonical URL was specified. How to refresh the index now?
-
A wrong canonical URL was applied to thousands of pages on a client website, pointing them all to a single non-existent URL. Google has now de-indexed most of those pages. We have fixed the problem, but how do we get search engines to crawl those pages again and start showing them in search results?
I understand that a slow recovery is likely even if we do nothing. I was wondering whether we can fast-track the recovery. Any pointers?
Thanks
-
Yeah, this is a good starting point. Create a sitemap in Google Webmaster Tools (GWT) with just these pages (it's easier to monitor that way), and re-fetch any specific pages that are critical. You can also build internal links (even temporarily) to nudge the crawlers, or try promoting some pages via Google+. There's no foolproof method, though, just nudges.
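For the dedicated sitemap, here is a minimal sketch in Python that writes a sitemap containing only the affected pages. The function name `build_sitemap` and the example.com URLs are placeholders of mine, not anything from the original site; the XML namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    # Serialize the sitemap namespace as the default one (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        loc.text = url  # ElementTree escapes characters such as "&"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical list of the pages that carried the wrong canonical.
    affected_pages = [
        "https://www.example.com/products/widget-a",
        "https://www.example.com/products/widget-b",
    ]
    with open("sitemap-recovery.xml", "w", encoding="utf-8") as fh:
        fh.write(build_sitemap(affected_pages))
```

Keeping these URLs in their own file means the submitted/indexed counts for that sitemap reflect only the pages you are trying to recover.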
This assumes, of course, that you've corrected the canonical tags. If there shouldn't have been a canonical tag at all, then I'd recommend adding a self-referencing canonical (i.e. one pointing to the page itself). A new canonical tag seems to overwrite an old one better than just removing it; at least, that's my anecdotal observation.
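Before asking for a recrawl, it may be worth spot-checking that the fix actually shipped. A rough sketch in Python, using only the standard library, that reports whether a page's HTML carries a self-referencing canonical. The names `CanonicalFinder` and `canonical_status` are made up for illustration, and fetching the HTML is left to whatever tool you already use.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"].strip())


def canonical_status(page_url, html):
    """Classify a page's canonical as 'self', 'other', 'missing' or 'multiple'."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "missing"
    if len(finder.canonicals) > 1:
        return "multiple"
    # Exact string comparison: a trailing-slash or http/https difference
    # is reported as 'other' so that you can review it by hand.
    return "self" if finder.canonicals[0] == page_url else "other"


if __name__ == "__main__":
    # Hypothetical page and markup, for illustration only.
    url = "https://www.example.com/products/widget-a"
    html = f'<html><head><link rel="canonical" href="{url}" /></head></html>'
    print(canonical_status(url, html))
```

Any page that comes back as "other" or "multiple" is still sending a conflicting signal and is worth fixing before you resubmit it.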
-
Hi,
You could try fetching the page in Webmaster Tools (the homepage, I guess, unless these pages were all part of one subsection of the site), which should help speed up the process. And of course, if you haven't already, make sure you have a valid sitemap in GWT that contains all the relevant URLs. With a bit of patience, the pages should come back.
-
Canonical isn't like a 301, where the page is eventually dropped. Canonical is a hint the page gives as to which URL should win the duplicate content race. It doesn't mean you won't be crawled again, but it might take some time. The key factor here is page popularity: the more popular a page, the faster it gets crawled.
Have you considered a social campaign for the pages in question?