URL Parameter & crawl stats
-
Hey guys,
I recently used the URL parameter tool in Webmaster Tools (WMT) to mark different URLs that offer the same content.
I have the parameters "?source=site1", "?source=site2", etc. A URL looks like this: www.example.com/article/12?source=site1
The "source" parameter identifies feeds that we provide to partner sites, so that we can track the referring site in our internal analytics platform.
Although pages like www.example.com/article/12?source=site1 have a canonical pointing to the original page www.example.com/article/12, Google indexed both URLs:
www.example.com/article/12?source=site1
and
www.example.com/article/12
Last week I used the URL parameter tool to mark the "source" parameter as "No, this parameter doesn't affect page content (tracks usage)", and today I see a 40% decrease in my crawl stats.
On one hand, it makes sense that Google is no longer crawling the repeated URLs with different sources; on the other hand, I thought that more efficient crawling would increase my crawl stats.
In addition, Google is still indexing the same pages with different source parameters.
Has anyone experienced something similar? And when crawl efficiency improves, should I expect my crawl stats to go up or down?
I really appreciate all the help!
Thanks!
-
I wouldn't freak out too much over the crawl rate immediately. Wait a few weeks and see how things go. It sounds like you did the right thing and should see the benefits over the next few weeks.
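As a rough illustration of what treating "source" as a tracking-only parameter amounts to, here is a minimal Python sketch that strips the parameter and groups URLs that collapse to the same page. The function names and the TRACKING_PARAMS set are made up for this example, and this is not how Google's tool works internally; it is only a way to sanity-check your own URL lists or logs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption for illustration: "source" is the only tracking-only parameter.
TRACKING_PARAMS = {"source"}


def strip_tracking(url: str) -> str:
    """Return the URL with tracking-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def group_duplicates(urls):
    """Map each stripped URL to the list of original URLs that collapse to it."""
    groups = {}
    for url in urls:
        groups.setdefault(strip_tracking(url), []).append(url)
    return groups


if __name__ == "__main__":
    sample = [
        "http://www.example.com/article/12",
        "http://www.example.com/article/12?source=site1",
        "http://www.example.com/article/12?source=site2",
    ]
    for page, variants in group_duplicates(sample).items():
        print(page, len(variants))
```

Running it over a list of crawled or indexed URLs shows how many variants each page has, which gives a feel for how much of the crawl was going to duplicates.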
-
Thanks Martin,
I see what you are saying, but I don't think the drop in pages crawled per day matches the number of duplicate pages I have.
Virtually every page I have has a duplicate "source=site1" version, yet the decrease was only around 35%.
Another thing that happened, which I did not mention, is that I recently redirected the cdn.site.com version of the site to the original site.com.
I'm thinking that all the new redirects within the site could also have affected crawlability. Any ideas?
Today the crawl stats are a bit higher than yesterday but still under the 90-day average.
Thanks
-
Hi Arie,
Do you have an idea of how many pages were crawled before and how many duplicate pages there were? With those numbers you could check whether the duplicates account for the decrease in crawl stats. I've seen before that preventing Google from crawling some pages decreases the crawl rate, so you're probably OK with this.
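To make that comparison concrete, here is a small Python sketch. The numbers in the usage example are placeholders, not figures from this thread, and the function names are made up for illustration. It also assumes the only change is that duplicate URLs stop being crawled, which ignores other factors such as the redirects mentioned above.

```python
def observed_drop(before: float, after: float) -> float:
    """Fractional decrease in pages crawled per day (0.35 means a 35% drop)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def expected_after(before: float, duplicate_share: float) -> float:
    """Pages per day expected if the duplicate share of the crawl disappears.

    duplicate_share is the fraction of previously crawled URLs that were
    parameter duplicates (an estimate you have to supply yourself).
    """
    if not 0.0 <= duplicate_share <= 1.0:
        raise ValueError("duplicate_share must be between 0 and 1")
    return before * (1.0 - duplicate_share)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    before, after = 10_000, 6_500
    print(f"observed drop: {observed_drop(before, after):.0%}")
    print(f"expected pages/day if 40% were duplicates: {expected_after(before, 0.40):.0f}")
```

If the observed drop is close to the estimated duplicate share, the parameter setting alone probably explains the change; a large gap suggests something else is also affecting the crawl.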
Related Questions
-
URL Parameters
Hi Moz Community, I'm working on a website that has URL parameters. After crawling the site, I implemented canonical tags on all these URLs to prevent them from getting indexed by Google. However, today I found out that Google has indexed plenty of parameterized URLs. 1. Some of these URLs have canonical tags yet are still indexed and live. 2. Some can't be discovered through site crawling and result in a 5xx server error. Is there anything else I can do (other than adding canonical tags), and how can I discover parameterized URLs that are indexed but not visible through site crawling? Thanks in advance!
Intermediate & Advanced SEO | | bbop330 -
Which URL should I choose when combining content?
I am combining content from two similar articles into one. URL 1 has a featured snippet and better URL structure, but only 5,000 page views in the last 6 months, and has 39 keywords ranking in the top 10. URL 2 has worse structure, but over 100k page views in the last 6 months, and 236 keywords in the top 10. Basically, I'm wondering whether to keep the one with the better URL structure or the one with more traffic. The deleted URL will be redirected to whichever I keep.
Intermediate & Advanced SEO | | curtis-yakketyyak0 -
Crawl diagnostic issue?
I'm sorry if my English isn't very good, but this is my problem at the moment: on two of my campaigns I get a weird error in Moz Analytics: "605 Page Banned by robots.txt, X-Robots-Tag HTTP Header, or Meta Robots Tag". Moz Analytics points to a URL that starts with http://None/www.????.com. We don't understand how Moz indexed this non-existent page that starts with None. How can we solve this error? I hope someone can help me.
Intermediate & Advanced SEO | | nettt0 -
Yoast & rel canonical for paginated Wordpress URLs
Hello, our Wordpress blog at http://www.jobs.ca/career-resources has a rel canonical issue since we added pagination to the front page and category-pages. We're using Yoast and it's incorrectly applying a rel-canonical meta tag referencing page 1 on page 2, 3, etc. This is a known misuse of the rel-canonical tag (per Google's Webmaster Blog - http://googlewebmastercentral.blogspot.ca/2013/04/5-common-mistakes-with-relcanonical.html, which says rel-canonical should be replaced with rel-prev and rel-next for page 2, 3, etc.). We don't see a way to specify anywhere in Yoast's options to correct this behaviour for page 2, 3, etc. Yoast allows you to override a page's canonical URL, otherwise it automatically uses the Wordpress permalink. My question is, does anyone know how to configure Yoast to properly replace rel-canonical tags with rel-prev and rel-next for paginated URLs, or do I need to look at another plugin or customize the behavior directly in my child theme code? This issue was brought up here as well: http://moz.com/community/q/canonical-help, but the only response did not relate to Yoast. (We're using Wordpress 3.6.1 and Yoast "Wordpress SEO" 1.4.18)
Intermediate & Advanced SEO | | aactive0 -
URL or Domain length
Hi all, I am wondering whether Google still gives importance to the length of the domain or URL. If so, what is an acceptable length for a domain and URL? Many thanks!
Intermediate & Advanced SEO | | HiteshBharucha0 -
Does having a trailing slash make a url different than the same url without the trailing slash?
Does having a trailing slash make a URL different from the same URL without the trailing slash? www.example.com/services or www.example.com/services/ Does Google consider these to be the same link, or does Google treat them as different links?
Intermediate & Advanced SEO | | webestate0 -
Canonical URLs and Sitemaps
We are using canonical link tags for product pages in a scenario where the URLs on the site contain category names, and the canonical URL points to a URL which does not contain the category names. So, the product page on the site is like www.example.com/clothes/skirts/skater-skirt-12345, and also like www.example.com/sale/clearance/skater-skirt-12345 in another category. And on both of these pages, the canonical link tag references a 3rd URL like www.example.com/skater-skirt-12345. This 3rd URL, used in the canonical link tag is a valid page, and displays the same content as the other two versions, but there are no actual links to this generic version anywhere on the site (nor external). Questions: 1. Does the generic URL referenced in the canonical link also need to be included as on-page links somewhere in the crawled navigation of the site, or is it okay to be just a valid URL not linked anywhere except for the canonical tags? 2. In our sitemap, is it okay to reference the non-canonical URLs, or does the sitemap have to reference only the canonical URL? In our case, the sitemap points to yet a 3rd variation of the URL, like www.example.com/product.jsp?productID=12345. This page retrieves the same content as the others, and includes a canonical link tag back to www.example.com/skater-skirt-12345. Is this a valid approach, or should we revise the sitemap to point to either the category-specific links or the canonical links?
Intermediate & Advanced SEO | | 379seo0 -
Lots of incorrect urls indexed - Googlebot found an extremely high number of URLs on your site
Hi, any assistance would be greatly appreciated. Our rankings and traffic have been dropping massively recently, and Google sent us a message stating "Googlebot found an extremely high number of URLs on your site". This first alerted us to the problem: for some reason our eCommerce site has recently generated loads (potentially thousands) of rubbish URLs, giving us duplication everywhere, which Google is obviously penalizing us for in terms of dropping rankings. Our developer is trying to find the root cause, but my concern is: how do we get rid of all these bogus URLs? If we use GWT to remove URLs it's going to take years. We have just amended our robots.txt file to exclude them going forward, but they have already been indexed, so do we put a 301 redirect on them and also an HTTP 404 code to tell Google they don't exist? Do we also put a noindex on the pages? What is the best solution? A couple of examples of our problems: in Google, type site:bestathire.co.uk inurl:"br" and you will see 107 results. This is one of many lots we need to get rid of. Also, site:bestathire.co.uk intitle:"All items from this hire company" shows 25,300 indexed pages we need to get rid of. Another thing to help tidy this mess up going forward is to improve our pagination. Our site uses rel=next and rel=prev but no canonical. As a belt-and-braces approach, should we also put canonical tags on our category pages where there is more than one page? I was thinking of doing it on page 1 of our most important pages, or the View All page, or both. What's the general consensus? Any advice on both points greatly appreciated. Thanks, Sarah.
Intermediate & Advanced SEO | | SarahCollins0