Can I, in Google's good graces, check for Googlebot to turn on/off tracking parameters in URLs?
-
Basically, we use a number of parameters in our URLs for event tracking. Google could be crawling an infinite number of these URLs. I'm already using the canonical tag to point at the non-tracking versions of those URLs... that doesn't stop the crawling, though.
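To make that concrete, here's a minimal sketch of the setup being described, with a hypothetical domain and parameter names: every tracked variant of a page points at the clean URL.

```html
<!-- Served on every tracked variant, e.g.
     https://www.example.com/widgets?eventID=123&src=email
     (the domain and parameter names here are hypothetical) -->
<link rel="canonical" href="https://www.example.com/widgets" />
```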
I want to know if I can do conditional 301s or just detect the user agent as a way to know when to NOT append those parameters.
Just trying to follow their guidelines about allowing bots to crawl without things like session IDs... but they don't tell you HOW to do this.
Thanks!
-
No problem Ashley!
It sounds like that would fall under cloaking, albeit pretty benign as far as cloaking goes. There's some more info here. The Matt Cutts video on that page has a lot of good information. Apparently any cloaking is against Google's guidelines. I suspect you could get away with it, but I'd be worried every day about a Google penalty getting handed down.
-
The syntax is correct. Assuming the site: and inurl: operators work in Bing, as they do in Google, then Bing is not indexing URLs with the parameters.
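For anyone wanting to run the same check, the query looks like this (hypothetical domain and parameter name); no results means the engine isn't indexing the parameterized variants:

```
site:www.example.com inurl:eventID
```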
That article you've referred to only explains how to sniff out Googlebot... and it's one of a couple I've seen. What it doesn't tell me, unfortunately, is whether there are any consequences of doing so and then taking some kind of action, like shutting off the event-tracking parameters in this case.
Just to be clear...thanks a bunch for helping out!
-
My sense from what you told me is that canonicals should be working in your case. What you're trying to use them for is exactly what they're intended to do. You're sure the syntax is correct, and that they're in the <head> of the page or being set in the HTTP header?
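For reference, the HTTP header version is a Link response header, which is handy for non-HTML resources like PDFs where there's no <head> to put the tag in. A sketch with a hypothetical URL:

```
HTTP/1.1 200 OK
Link: <https://www.example.com/widgets>; rel="canonical"
```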
Google does set it up so you can sniff out Googlebot and return different content (see here), but that would be unusual to do given the circumstances. I doubt you'd get penalized for cloaking for redirecting parameterized URLs to canonical ones for only Googlebot, but I'd still be nervous about doing it.
Just curious, is Bing respecting the canonicals?
-
Yeah, we can't noindex anything because there literally is NO way to crawl the site without picking up tracking parameters.
So we're saying that there is literally no good/approved way to say "oh look, it's google. let's make sure we don't put any of these params on the URL."? Is that the consensus?
-
If these duplicate pages have URLs that are appearing in search results, then the canonicals aren't working or Google just hasn't tried to reindex those pages yet. If the pages are duplicates, and you've set the canonical correctly, and entered them in Google Webmaster Tools, over time those pages should drop out of the index as Google reindexes them. You could try submitting a few of these URLs with parameters to Google to reindex manually in Google Webmaster Tools, and see if afterward they disappear from the results pages. If they do, then it's just a matter of waiting for Googlebot to find them all.
If that doesn't work, you could try something tricky: add meta noindex tags to the pages with URL parameters, wait until they fall out of the index, then add the canonical tags back on and see if those pages come back into the SERPs. If they do, then Google is ignoring your canonical tags. I hate to temporarily noindex pages like this... but if they're all appearing separately in the SERPs anyhow, they're not pooling their link juice properly as it is.
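The temporary tag that test relies on is the standard robots meta tag; a sketch of what would sit in the <head> of each parameterized page until it drops out of the index:

```html
<!-- Temporary: asks search engines to drop this page from the index.
     Swap back to the canonical tag once the page disappears from the SERPs. -->
<meta name="robots" content="noindex" />
```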
-
Thank you for your response. Even if I tell them that the parameters don't alter content, which I have, that doesn't change how many pages Google has to crawl. That's my main concern... that Googlebot is spending too much time on these alternate URLs.
Plus there are millions of these param-laden URLs in the index, regardless of the canonical tag. There is currently no way for Google to crawl the site without picking up parameters that change constantly throughout each visit. This can't be optimal.
-
You're doing the right thing by adding canonicals to those pages. You can also go into Google Webmaster Tools and let them know that those URL parameters don't change the content of the pages. This really is the bread and butter of canonical tags. This is the problem they're supposed to solve.
I wouldn't sniff out Googlebot just to 301 those URLs with parameters to the canonical versions. The canonicals should be sufficient. If you do want to sniff out Googlebot, Google's directions are here. You don't do it by user agent; you do a reverse DNS lookup. Again, I would not do this in your case.
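If you do end up needing to verify Googlebot, here's a minimal sketch of that reverse-plus-forward DNS check in Python. The .googlebot.com / .google.com suffixes follow Google's published guidance, and the sample IP in the comment is illustrative only:

```python
import socket


def is_googlebot(ip: str) -> bool:
    """Verify that an IP claiming to be Googlebot really belongs to Google,
    using the reverse-then-forward DNS check Google recommends."""
    try:
        # Step 1: reverse DNS. A genuine Googlebot IP resolves to a host
        # name ending in googlebot.com or google.com.
        host, _aliases, _ips = socket.gethostbyaddr(ip)
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Step 2: forward DNS. The host name must resolve back to the same
        # IP, otherwise the PTR record could simply be spoofed.
        _name, _aliases, forward_ips = socket.gethostbyname_ex(host)
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        # No PTR record, or the forward lookup failed: treat as not Googlebot.
        return False


# Usage (the IP below is illustrative only; real Googlebot IPs vary):
# print(is_googlebot("66.249.66.1"))
```

The forward lookup is the half that matters: anyone can publish a PTR record claiming to be Googlebot, but only Google can make that host name resolve back to the original IP.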
Related Questions
-
Client wants to remove mobile URLs from their sitemap to avoid indexing issues. However this will require SEVERAL billing hours. Is having both mobile/desktop URLs in a sitemap really that detrimental to search indexing?
We had an enterprise client ask to remove mobile URLs from their sitemaps. For their website both desktop & mobile URLs are combined into one sitemap. Their website has a mobile template (not a responsive website) and is configured properly via Google's "separate URL" guidelines. Our client is referencing a statement made by John Mueller that having both mobile & desktop sitemaps can be problematic for indexing. Here is the article: https://www.seroundtable.com/google-mobile-sitemaps-20137.html
We would be happy to remove the mobile URLs from their sitemap. However, this will unfortunately take several billing hours for our development team to implement and QA, and will end up costing our client a great deal of money. Is it worth it to remove the mobile URLs from their main website to adhere to John Mueller's advice? We don't believe these extra mobile URLs are harming their search indexing, but we can't find any sources to explain otherwise. Any advice would be appreciated. Thx.
Intermediate & Advanced SEO | RosemaryB
-
Why isn't Google indexing this site?
Hello, Moz Community. My client's site hasn't been indexed by Google, although it was launched a couple of months ago. I've run down the checkpoints in this article https://moz.com/ugc/8-reasons-why-your-site-might-not-get-indexed without finding a reason why. Any sharp SEO eyes out there who can spot this quickly? The URL is: http://www.oldermann.no/ Thank you
Intermediate & Advanced SEO | Inevo
INEVO, digital agency
-
No designated 404 page, but any made-up URL path displays the homepage. Good or bad?
I have a custom website where if you type in companyxyz.com/any-made-up-url it displays the homepage. So then you will see the homepage, and in the URL bar the made-up URL path remains visible: "companyxyz.com/any-made-up-url". Is this good or bad, or not an issue?
Intermediate & Advanced SEO | Rich_Coffman
-
Duplicate Content with URL Parameters
Moz is picking up a large quantity of duplicate content, consisting mainly of URL parameters like ,pricehigh & ,pricelow etc. (for page sorting). Google has indexed a large number of the pages (not sure how many), and I'm not sure how many of them are ranking for search terms we need. I have added the parameters in Google Webmaster Tools and set them to 'let Google decide', but Google still sees it as duplicate content. Is it a problem that we need to address? Or could trying to fix it do more harm than good? Has anyone had any experience? Thanks
Intermediate & Advanced SEO | seoman10
-
Website penalised, can't find where the problem is. Google went INSANE
Hello, I desperately need a hand here! Firstly I just want to say that we never infringed Google's guidelines as far as we know. I have been around in this field for about 6 years and have had success with many websites along the way, relying only on natural SEO, and was never penalised until now. The problem is that our website www.turbosconto.it is penalised, and we have no idea why (it's not a manual penalty). The site has been online for more than 6 months and it NEVER started to rank; it has about 2 organic visits a day at most. In this time we got several links from good websites related to our topic, which actually keep sending us about 50 visits a day. Nevertheless our organic visits are still 1 or 2 a day. All the pages seem to be heavily penalised... when you perform a search for any of our "shops", even including our URL, no entries for the domain appear. A search example: http://www.turbosconto.it zalando. What I would expect to find as a result: http://www.turbosconto.it/buono-sconto-zalando. The same case repeats for all of the pages for the "shops" we promote. Searching any of the brands + our domain shows no result except for "nike" and "euroclinix" (I see no relationship between these 2). Some days ago, for these same types of searches, Google was showing pages from the domain which we blocked via robots.txt months ago, and which go to a 404 error, instead of our optimised landing pages, which cannot be found in the first 50 results. These pages are generated by our rating system... We already sent requests to deindex all these pages but they keep appearing for every new page that we create. And the real pages are nowhere to be found... Here is an example: http://www.turbosconto.it/shops/codice-promozionale-pimkie/rat
You can see how Google indexes that in this search: site:www.turbosconto.it rate. Why on earth would Google show a page which is blocked by robots.txt, displaying that the content cannot be retrieved because it is blocked, instead of showing pages which are totally SEO-friendly and content-rich... All the script from TurboSconto is the same one that we use in our Spanish version, www.turbocupones.com. With that one we have awesome results, so it makes things even weirder... OK, apart from those weird issues with indexation and robots, we did some research on our backlinks and were surprised to find a few bad links that we never asked for. Nevertheless there are just a few, and we have many HIGH QUALITY LINKS, which makes it hard to believe that this could be the reason. Just to be sure, we used the disavow tool for these links. Here are the bad links we submitted 2 days ago:
domain: www.drilldown.it # we did not ask for this
domain: www.indicizza.net # we did not ask for this
domain: urlbook.in # we did not ask for this, moreover it is a spammy one
http://inpe.br.way2seo.org/domain-list-878 # we did not ask for this, moreover it is a spammy one
http://shady.nu.gomarathi.com/domain-list-789 # we did not ask for this, moreover it is a spammy one
http://www.clicdopoclic.it/2013/12/i-migliori-siti-italiani-di-coupon-e.html # we did not ask for this, moreover it is a copy of a post from another blog
http://typo.domain.bi/turbosconto.it
I have no clue what it can be; we have no warning messages in Webmaster Tools or anything. For me it looks as if Google has a BUG and went crazy judging our Italian website. Or perhaps we are just missing something??? If anyone could throw some light on this I would be really glad, and willing to pay some compensation for the help provided. THANKS A LOT!
Intermediate & Advanced SEO | sebastiankoch
-
Are This Site's Backlinks Hurting Us?
Google WMT reports more than 198,000 backlinks to our site (www.audiobooksonline.com) from http://dilandau.eu/? We have never been notified by Google of any penalty or malware, but we continue to struggle to get our page 1 Google ranking back since Panda. Could these backlinks be hurting our Google ranking? Should I implement a disavow rule for http://dilandau.eu/?
Intermediate & Advanced SEO | lbohen
-
Killing 404 errors on our site in Google's index
Having moved a site across to Magento, redirects were obviously a large part of that, ensuring all the old products and categories linked up correctly with the new site structure. However, we came up against an issue where we needed to add, delete, then re-add products. This, coupled with a misunderstanding of the CSV upload processing, meant that although the old URLs redirected, some of the new Magento URLs changed and then didn't redirect. For example, mysite/product would get deleted, re-added, and become mysite/product-1324. We now know what we did wrong, so it won't continue to happen if we were to delete and re-add a product, but Google contains all these old URLs in its index, which has caused people to search for products on Google, click through, then land on the 404 page... far from ideal. We kind of assumed, with continual updating of sitemaps and time, that Google would realise and update the URLs accordingly. But this hasn't happened; we are still getting plenty of 404 errors on certain product searches. (These aren't appearing in SEOmoz; there are no links to the old URLs on the site, only in Google, as the index contains the old URLs.) Aside from going through and finding the products affected (no easy task), and setting up redirects for each one, is there any way we can tell Google 'These URLs are no longer a thing, forget them and move on, let's make a fresh start and Happy New Year'?
Intermediate & Advanced SEO | seanmccauley
-
Creating 100,000s of pages: good or bad idea?
Hi folks, over the last 10 months we have focused on quality pages but have been frustrated with competitor websites outranking us because they have bigger sites. Should we focus on the long tail again? One option for us is to take every town across the UK and create pages using our activities, e.g. for Stirling:
Stirling paintball
Stirling Go Karting
Stirling Clay shooting
We are not going to link to these pages directly from our main menus, but from the sitemap. These pages would then show activities that were within a 50-mile radius of the town. At the moment we have focused our efforts on regions, e.g. Paintball Scotland, Paintball Yorkshire, focusing all the internal link juice on these regional pages, but we don't rank high for towns that the activity sites are close to. With 45,000 towns and 250 activities we could create over a million pages, which seems very excessive! Would creating 500,000 of these types of pages damage our site? This is my main worry. Or would it make our site rank even higher for the tougher keywords and also get lots of traffic from the long tail like we used to? Is there a limit to how big a site should be?
Intermediate & Advanced SEO | PottyScotty