URL Parameter Being Improperly Crawled & Indexed by Google
-
Hi All,
We just discovered that Google is indexing a subset of our URLs with our analytics tracking parameter attached. For the search “dresses” we are appearing in position 11 (page 2, rank 1) with the following URL:
www.anthropologie.com/anthro/category/dresses/clothes-dresses.jsp?cm_mmc=Email--Anthro_12--070612_Dress_Anthro-_-shop
You’ll note that “cm_mmc=Email” is appended. This is causing our analytics (CoreMetrics) to mis-attribute this traffic and revenue to Email vs. SEO.
A few questions:
1) Why is this happening? This is from an email sent in June 2012, and we don’t have an email-specific landing page that uses this parameter. Somehow Google found and indexed this page with these tracking parameters. Has anyone else seen something similar happening?
2) What is the recommended method of “politely” telling Google to index the version without the tracking parameters? Some thoughts on this:
a. Implement a self-referencing canonical on the page.
- This is done, but we have some technical issues with the canonical due to our ecommerce platform (ATG). Even though page source code looks correct, Googlebot is seeing the canonical with a JSession ID.
b. Resubmit both URLs using the WMT Fetch feature, hoping that Google recognizes the canonical.
- We did this, but given the canonical issue it won’t be effective until we can fix it.
c. URL handling change in WMT
- We made this change, but it didn’t seem to fix the problem.
d. 301 or noindex the version with the email tracking parameters
- This seems drastic and I’m concerned that we’d lose ranking on this very strategic keyword.
Thoughts?
Thanks in advance,
Kevin
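For the canonical issue in 2a, the snippet below shows what the canonical URL should look like once the session ID is removed, which can help when comparing against what Googlebot reports. It is a minimal Python sketch, not ATG code: `strip_jsessionid` is a hypothetical helper, the session ID value is made up, and it assumes the ID is appended as a `;jsessionid=` path parameter (the usual Java servlet URL-rewriting form).

```python
import re

# Matches a ";jsessionid=..." path parameter up to the next "?", "#", or end of string.
_JSESSIONID = re.compile(r";jsessionid=[^?#]*", re.IGNORECASE)


def strip_jsessionid(url: str) -> str:
    """Return the URL with any ;jsessionid path parameter removed."""
    return _JSESSIONID.sub("", url)


# Hypothetical example of a canonical as a crawler might see it.
# "ABC123" is a made-up session ID, not a real value.
seen_by_crawler = (
    "www.anthropologie.com/anthro/category/dresses/"
    "clothes-dresses.jsp;jsessionid=ABC123"
)
print(strip_jsessionid(seen_by_crawler))
```

If the cleaned value matches the canonical in the page source, the mismatch is probably coming from session handling for cookieless clients rather than from the canonical tag itself.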
-
Hey jStrong,
Thanks for your response.
I was thinking along the same lines, but I'm TERRIFIED of losing rank for this keyword. Technically, you're correct. However, what Google actually does can sometimes be questionable.
I think we'll test this out on one of our lower volume and less strategic keywords and see how Google reacts.
I'll respond to this thread once we get results back.
Thanks again!
Kevin
-
Hi Kevin,
I have seen URLs get picked up by Google that are seemingly nowhere to be found. In this case I would set up the 301 redirect. The page being redirected to has the canonical, which tells Google that it is the correct page to index. The 301 also tells Google that the page currently indexed is no longer valid and that it should update the SERP to display the correct page instead. There is a chance you lose some ranking, but if the content is the same I would expect the loss to be minimal, as stated in this Moz article about redirects, and you could probably regain any lost ranking relatively quickly.
Hope this helps.
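The redirect suggested above could be computed along these lines. This is only a Python sketch of the idea, not the site's actual redirect layer: `redirect_target` and `TRACKING_PARAMS` are hypothetical names, the `http://` scheme is assumed (the URL in the question has none), and it assumes `cm_mmc` is the only parameter that needs stripping.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: cm_mmc is the only tracking parameter to remove.
TRACKING_PARAMS = {"cm_mmc"}


def redirect_target(url: str):
    """Return the clean URL to 301 to, or None if no tracking parameter is present."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key not in TRACKING_PARAMS]
    if len(kept) == len(pairs):
        return None  # nothing to strip, serve the page normally
    return urlunsplit(parts._replace(query=urlencode(kept)))


indexed = (
    "http://www.anthropologie.com/anthro/category/dresses/clothes-dresses.jsp"
    "?cm_mmc=Email--Anthro_12--070612_Dress_Anthro-_-shop"
)
print(redirect_target(indexed))
```

One thing to keep in mind: a server-side redirect like this removes the parameter before any on-page analytics tag can read it, so it may be safer to apply it only to retired campaign values such as the June 2012 email.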
Related Questions
-
How to deal with parameter URLs as primary internal links and not canonicals? Weird situation inside...
So I have a weird situation, and I was hoping someone could help. This is for an ecommerce site.
1. Parameters are used to tie Product Detail Pages (PDP) to individual categories. This is represented in the breadcrumbs for the page and the use of a categoryid. One product can thus be included in multiple categories.
2. All of these PDPs have a canonical that does not include the parameter / categoryid.
3. With very few exceptions, the canonical URLs for the PDPs are not linked to. Instead, the parameter URL is used to tie the product to a specific category. This is done primarily for the sake of breadcrumbs, it seems.
One of the big issues we've been having is the canonical URLs not being indexed for a lot of the products. In some instances, the canonicals are indexed alongside parameters, or just parameter URLs are indexed. It's all very...mixed up, I suppose. My theory is that because the majority of canonical URLs are not linked to anywhere on the site, Google is giving preference to the internally linked URL instead. My problem? I have no idea what to recommend to the client (who will not change the parameter setup). One of our Technical SEOs recommended we "Use cookies instead of parameters to assign breadcrumbs based on how the PDP is accessed." I have no experience with this. So....yeah. Any thoughts? Suggestions? Thanks in advance.
Intermediate & Advanced SEO | | Alces0 -
Will Google recognize a canonical to a redirected URL?
A third party canonicalizes to our content, and we've recently needed to re-direct that content to a new URL. The third party is going to take some time updating their canonicals, and I am wondering if search engines will still recognize the canonical even though there is a re-direct in place?
Intermediate & Advanced SEO | | nicole.healthline0 -
Proper 301 in Place but Old Site Still Indexed In Google
So I have stumbled across an interesting issue with a new SEO client. They recently launched a new website and implemented a proper 301 redirect strategy at the page level for the new website domain. What is interesting is that the new website is now indexed in Google BUT the old website domain is also still indexed in Google. I even checked the Google cache date and it shows the new website with a cache date of today. The redirect strategy has been in place for about 30 days. Any thoughts or suggestions on how to get the old domain un-indexed in Google and get all authority passed to the new website?
Intermediate & Advanced SEO | | kchandler0 -
Best way for Google and Bing not to crawl my /en default english pages
Hi guys, I just transferred my old site to a new one and now have subfolder TLDs. My default pages from the front end and sitemap don't show /en after www.mysite.com. The only translation I have is in Spanish, where Google will crawl www.mysite.com/es (Spanish).
1. On the SERPs of Google and Bing, every URL that is crawled shows the extra "/en" in my TLD. I find that very weird considering there is no physical /en in my URLs. When I select the link it automatically redirects to its default and natural page (no /en). None of the canonical tags show /en either, ONLY the SERPs. Should robots.txt be updated to "disallow /en"?
2. While I did a site transfer, we altered some of the category URLs in our domain, so we've had a lot of 301 redirects. But when searching specific keywords in the SERPs, the #1 ranked URL shows up as our old URL that redirects to a 404 page, and our newly created URL shows up as #2 and goes to the correct page. Is there any way to tell Google to stop showing our old URLs in the SERPs? And would the "Fetch as Google" option in GWT be a good way to upload all of my URLs so Google's bots crawl the right pages only?
Direct message me if you want real examples. Thank you so much!
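Before adding a "disallow /en" rule, it may help to check which paths it would actually match. The Python sketch below uses the standard library's `urllib.robotparser` for that. The rule text follows the question, while the `blocked` helper and the test paths (including `/entertainment`) are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Parse the proposed rule directly instead of fetching a live robots.txt.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /en",
])


def blocked(path: str) -> bool:
    """Return True if the proposed rule would stop a crawler from fetching this path."""
    return not rules.can_fetch("*", "http://www.mysite.com" + path)


# Hypothetical paths used only to show how the rule matches.
for path in ["/en/products", "/es/products", "/entertainment", "/"]:
    print(path, blocked(path))
```

Because the rule is matched as a path prefix, it also catches any other path that happens to start with `/en`, so `Disallow: /en/` (with the trailing slash) may be closer to the intent.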
Intermediate & Advanced SEO | | Shawn1240 -
Custom Google Search & Joomla/Wordpress
If you install Google Custom Search on a site, does it record a list of all the searches people type into the search box? Is there a Joomla or WordPress search plugin/extension that keeps track of the search history used on your site(s)?
Intermediate & Advanced SEO | | JohnW-UK0 -
Google Indexing Feedburner Links???
I just noticed that for lots of the articles on my website, there are two results in Google's index. For instance: http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html and http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+thewebhostinghero+(TheWebHostingHero.com)
Now my Feedburner feed is set to "noindex" and it's always been that way.
The canonical tag on the webpage is set to: <link rel='canonical' href='http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html' />
The robots tag is set to: <meta name="robots" content="index,follow,noodp" />
I found out that there are scraper sites that are linking to my content using the Feedburner link. So should the robots tag be set to "noindex" when the requested URL is different from the canonical URL? If so, is there an easy way to do this in WordPress?
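The rule being asked about (noindex whenever the requested URL differs from the canonical) can be written down as a small function. This Python sketch only illustrates that rule and is not a WordPress plugin or a recommendation: `robots_meta` is a hypothetical helper, and using `noindex,follow` for the non-canonical case is an assumption.

```python
def robots_meta(requested_url: str, canonical_url: str) -> str:
    """Return the robots meta content for a request, per the rule in the question."""
    if requested_url == canonical_url:
        # Canonical request: keep the value the page already uses.
        return "index,follow,noodp"
    # Assumption: non-canonical variants get noindex but keep follow.
    return "noindex,follow"


canonical = (
    "http://www.thewebhostinghero.com/articles/"
    "tools-for-creating-wordpress-plugins.html"
)
feed_variant = canonical + "?utm_source=feedburner&utm_medium=feed"

print(robots_meta(canonical, canonical))
print(robots_meta(feed_variant, canonical))
```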
Intermediate & Advanced SEO | | sbrault740 -
Can I, in Google's good graces, check for Googlebot to turn on/off tracking parameters in URLs?
Basically, we use a number of parameters in our URLs for event tracking. Google could be crawling an infinite number of these URLs. I'm already using the canonical tag to point at the non-tracking versions of those URLs....that doesn't stop the crawling tho. I want to know if I can do conditional 301s or just detect the user agent as a way to know when to NOT append those parameters. Just trying to follow their guidelines about allowing bots to crawl w/out things like sessionID...but they don't tell you HOW to do this. Thanks!
Intermediate & Advanced SEO | | KenShafer0 -
Magento Hidden Products & Google Not Found Errors
We recently moved our website over to the Magento eCommerce platform. Magento has functionality to make certain items not visible individually so you can, for example, take 6 products and turn them into 1 product where a customer can choose their options. You then hide all the individual products, leaving only that one product visible on the site and reducing duplicate content issues. We did this. It works great and the individual products don't show up in our site map, which is what we'd like. However, Google Webmaster Tools has all of these individual product URLs in its Not Found Crawl Errors. For example:
White t-shirt URL: /white-t-shirt
Red t-shirt URL: /red-t-shirt
Blue t-shirt URL: /blue-t-shirt
All of those are not visible on the site and the URLs do not appear in our site map. But they are all showing up in Google Webmaster Tools.
Configurable t-shirt URL: /t-shirt
This product is the only one visible on the site, does appear on the site map, and shows up in Google Webmaster Tools as a valid URL.
Do you know how Google found the individual products if they aren't in the site map and aren't visible on the website? And how important do you think it is that we fix all of these hundreds of Not Found errors to point to the single visible product on the site? I would think it is fairly important, but I don't want to spend a week of manpower on it if the returns would be minimal. Thanks so much for any input!
Intermediate & Advanced SEO | | Marketing.SCG0