If Google's index contains multiple URLs for my homepage, does that mean the canonical tag is not working?
-
I have a site that uses canonical tags on all pages. However, not all duplicate versions of the homepage are 301'd, due to a limitation in the hosting platform, so some site visitors get www.example.com/default.aspx while others just get www.example.com. I can see the correct canonical tag in the source code of both versions of the homepage, but when I search Google for the specific URL "www.example.com/default.aspx" I see that they've indexed that URL as well as the "clean" one. Is this a concern? Shouldn't Google only show me the clean URL?
-
In most cases, Google does seem to "de-index" the non-canonical URL if they process the tag. I put that in quotes because, technically, the page is still in Google's index. As soon as it's not showing up at all (including with "site:"), though, I essentially consider it de-indexed. If we can't see it, it might as well not be there.
If 301-ing isn't an option, I'd double-check a few things:
(1) Is the non-canonical page ranking for anything (including very long-tail terms)?
(2) Are there any internal links to the non-canonical URL? These can send a strongly mixed signal.
(3) Are there any other mixed signals that might be throwing off the canonical? Examples include canonicals on other pages that contradict this one, 301s/302s that override the canonical, etc.
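Checks (2) and (3) can be partly automated. Here is a minimal sketch, assuming you already have the page's HTML as a string; the class name `CanonicalAudit`, the function `audit`, and the sample markup are all hypothetical and only for illustration. It reports which canonical URL(s) the page declares and which links on the page still point at the non-canonical homepage URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalAudit(HTMLParser):
    """Collect the rel=canonical target(s) and every <a href> on one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.canonicals = []  # more than one entry is itself a mixed signal
        self.links = []       # absolute URLs of all anchors on the page

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_tokens:
            self.canonicals.append(urljoin(self.page_url, href))
        elif tag == "a":
            self.links.append(urljoin(self.page_url, href))


def audit(page_url, html, non_canonical_url):
    """Return (canonical URLs found, links that point to the non-canonical URL)."""
    parser = CanonicalAudit(page_url)
    parser.feed(html)
    stray = [u for u in parser.links if u == non_canonical_url]
    return parser.canonicals, stray


# Hypothetical page source, for illustration only.
SAMPLE = """
<html><head>
<link rel="canonical" href="http://www.example.com/">
</head><body>
<a href="/default.aspx">Home</a>
<a href="/products/">Products</a>
</body></html>
"""

canonicals, stray = audit(
    "http://www.example.com/default.aspx",
    SAMPLE,
    "http://www.example.com/default.aspx",
)
print("canonical:", canonicals)
print("links to non-canonical URL:", stray)
```

If `stray` is not empty, those are the links worth repointing at the clean URL. The sketch compares URLs exactly, so it will not catch variants that differ only in case or query string.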
-
As Digital-Diameter said, the best choice for fixing this problem is a 301. A canonical tag can eventually lead to the incorrect URL being replaced by the correct one in the SERPs, but it is also important to note that the rel=canonical tag is a suggestion, not a directive. This means the search engines will take it into consideration but may choose not to follow it.
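If the platform ever does allow redirects, the mapping behind that 301 is small. Below is a rough sketch of the logic only, not of any particular server's redirect configuration; the function name `redirect_target` and the `DEFAULT_DOCUMENTS` set are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of default-document names that duplicate the bare directory URL.
# Only default.aspx comes from the question; add others if your site serves them.
DEFAULT_DOCUMENTS = {"default.aspx"}


def redirect_target(url):
    """Return the clean URL a 301 should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() not in DEFAULT_DOCUMENTS:
        return None
    # Drop the default document and keep the directory, query, and fragment.
    return urlunsplit(
        (parts.scheme, parts.netloc, directory + "/", parts.query, parts.fragment)
    )


print(redirect_target("http://www.example.com/default.aspx"))
print(redirect_target("http://www.example.com/"))
```

A request handler would answer with status 301 and a `Location` header set to the returned URL whenever the function returns something other than `None`.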
-
Technically, rel=canonical tags can still leave a page indexed; they simply tell Google which URL should receive the authority. From your question I can tell you know this, but I do have to say that 301s are the best way to address this. Blocking a page with robots.txt can help as well, but this only stops Google from crawling the page; the page can still be indexed.
If you have pages or versions of pages that you do not want indexed, you may want to use the noindex meta tag. Google's notes here. Be careful, though: this will stop these pages from being indexed, but they will still be crawled (though your rel=canonical solution should make this a non-issue).
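Crawling and indexing are controlled separately, and a quick check can help keep the two straight. The sketch below uses only the Python standard library; the helper names `has_noindex` and `may_crawl` and the sample robots.txt rules are hypothetical. One caveat worth labeling: if a URL is blocked in robots.txt, a crawler that obeys it never fetches the page and so cannot see a noindex tag on it, which means combining the two on the same URL tends to defeat the noindex.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Record the directives of any <meta name="robots"> tag in the page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def has_noindex(html):
    """True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def may_crawl(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text allows user_agent to fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical inputs, for illustration only.
ROBOTS_TXT = "User-agent: *\nDisallow: /default.aspx\n"
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "http://www.example.com/default.aspx"

if has_noindex(PAGE) and not may_crawl(ROBOTS_TXT, URL):
    print("noindex is present, but robots.txt blocks the crawl, so it may never be seen")
```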
A few other notes:
In all cases, be sure your internal links point consistently to the URL version you have chosen for your home page.
Google Webmaster Tools (WMT) also creates a list of inbound links that are missing or broken. You can use this to help determine any additional 301s that you need.