URL Parameter & crawl stats
-
Hey Guys,I recently used the URL parameter tool in WBT to mark different urls that offers the same content.I have the parameter "?source=site1" , "?source=site2", etc...It looks like this: www.example.com/article/12?source=site1The "source parameter" are feeds that we provide to partner sites and this way we can track the referral site with our internal analytics platform.Although, pages like:www.example.com/article/12?source=site1 have canonical to the original page www.example.com/article/12, Google indexed both of the URLs
www.example.com/article/12?source=site1andwww.example.com/article/12Last week I used the URL parameter tool to mark "source" parameter "No, this parameter doesnt effect page content (track usage)" and today I see a 40% decrease in my crawl stats.In one hand, It makes sense that now google is not crawling the repeated urls with different sources but in the other hand I thought that efficient crawlability would increase my crawl stats.In additional, google is still indexing same pages with different source parameters.I would like to know if someone have experienced something similar and by increasing crawl efficiency I should expect my crawl stats to go up or down?I really appreciate all the help!Thanks! -
I wouldn't freak out too much over the crawl rate immediately. Wait a few weeks and see how things go. It sounds like you did the right thing and should see the benefits over the next few weeks.
-
Thanks Martin,
I see what are you saying, but I dont think it is possible to equal the amount of pages been crawled every day with the amount of duplicate pages that I have.
Virtually, every page that I have, have a duplicate version "source=site1", and the decrease was only around 35%.
Another thing that happen and I did not mention is that I recently redirected my cdn.site.com version of the site to the original site.com.
Im thinking that all the new redirect inside the site, could also have effected the crawlability. Any idea?
Today, the crawl stats is a bit higher than yesterday but still under the last 90 average.
Thanks
-
Hi Arie,
Do you have an idea about how many pages were crawled before and what the number of duplicate pages was? Then you could find out if this would clarify the decrease in crawl stats. I've seen it before that making sure that Google isn't able to crawl some pages will decrease the crawl rate so you're probably OK with this.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL structure with dash or slash
Hi, everyone Basically I am editing my website page's URL for SEO Optimisation and I am not sure which URL structure is best for SEO. The main different is the sign ( dash or slash ) before the product-code. HERE ARE TWO EXAMPLE www.example.com/long-tail-keyword-product-code www.example.com/long-tail-keyword/product-code To get more idea of my page, here is one of the product from my website : http://www.okeus.co.uk/pro_view-3.html My website is selling my own product, as a result the only keyword can be found was the name of the product and I separated different design by different code. Any experts who are willing help would be very much appreciated.
Intermediate & Advanced SEO | | chrisyu781 -
AMP for WordPress: To Do Or Not To Do
Hello SEO's, Recently some of my VIPs (Very Important Pages) have slipped, and all the pages above them are AMP. I've been waiting to switch to AMP for as long as possible bc I've heard it's a very mixed bag. As of Oct 2018, what do people think? Is it worth doing? Is there a preferred plugin for wordpress? Are things more likely to go right than wrong? The page that has gotten hit the hardest is https://humanfoodbar.com/plant-paradox-diet/plant-paradox-diet-full-shopping-list-for-lectin-free-diet/. It used to bring in ~70% of organic traffic. It was #1 and is now often near the bottom of the page. 😞 Thanks all! Remy
Intermediate & Advanced SEO | | remytennant1 -
Is AMP works on blogs only?
I have installed AMP Plugin in my WordPress website but when I check pages with /amp/ it shows 404 error. But for blog pages, for the example www.website.com/blog/post/amp/ it shows amp version of the particular page. Also, nothing is showing in search console Accelerate Moile pages.
Intermediate & Advanced SEO | | SEO-Stephanie0 -
What would cause these ⠃︲蝞韤諫䴴SPপ� emblems in my urls?
In Search Console I am getting errors under other. It is showing urls that have this format- https://www.site.com/Item/654321~SURE⠃︲蝞韤諫䴴SPপ�.htm When clicked it shows 蝞韤諫䴴SPপ� instead of the % stuff. As you can see this is an item page and the normal item page pulls up fine with no issues. This doesn't show it is linked from anywhere. Why would google pull this url? It doesn't exist on the site anywhere. It is a custom asp.net site. This started happening in mid May but we didn't make any changes then.
Intermediate & Advanced SEO | | EcommerceSite0 -
Location in URLs question
Hi there, my company is a national theater news publisher. Quick question about a particular use case. When an editor publishes a story they can assign several discrete locations, allowing it to appear on each of those locations within our website. This article (http://www.theatermania.com/denver-theater/news/full-casting-if-then-tour-idina-menzel_74354.html), for example, appears in the Los Angeles, San Francisco, and Denver section. We force the author to choose a primary location from that list, which controls the location displayed in the URL. Is this a bad practice? I'm wondering if the fact that having 'Denver' in the URL is misleading and hurts SEO value, particularly since that article features several other cities.
Intermediate & Advanced SEO | | TheaterMania0 -
URL with a # but no ! being indexed
Given that it contains a #, how come Google is able to index this URL?: http://www.rtl.nl/xl/#/home It was my understanding that Google can't handle # properly unless it's paired with a ! (hash fragment / bang). site:http://www.rtl.nl/xl/#/home returns nothing, but: site:http://www.rtl.nl/xl returns http://www.rtl.nl/xl/#/home in the result set
Intermediate & Advanced SEO | | EdelmanDigital0 -
Google News URL Structure
Hi there folks I am looking for some guidance on Google News URLs. We are restructuring the site. A main traffic driver will be the traffic we get from Google News. Most large publishers use: www.site.com/news/12345/this-is-the-title/ Others use www.example.com/news/celebrity/12345/this-is-the-title/ etc. www.example.com/news/celebrity-news/12345/this-is-the-title/ www.example.com/celebrity-news/12345/this-is-the-title/ (Celebrity is a channel on Google News so should we try and follow that format?) www.example.com/news/celebrity-news/this-is-the-title/12345/ www.example.com/news/celebrity-news/this-is-the-title-12345/ (unique ID no at the end and part of the title URL) www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ Others include the date. So as you can see there are so many combinations and there doesnt seem to be any unity across news sites for this format. Have you any advice on how to structure these URLs? Particularly if we want to been seen as an authority on the following topics: fashion, hair, beauty, and celebrity news - in particular "celebrity name" So should the celebrity news section be www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ or what? This is for a completely new site build. Thanks Barry
Intermediate & Advanced SEO | | Deepti_C0 -
Googlebot crawling partial URLs
Hi guys, I've checked my email this morning and I've got a number of 404 errors over the weekend where Google has tried to crawl some of my existing pages but not found the full URL. Instead of hitting 'domain.com/folder/complete-pagename.php' it's hit 'domain.com/folder/comp'. This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53) but I can't find where it would have found only the partial URL. It certainly wasn't on the domain it's crawling and I can't find any links from external sites pointing to us with the incorrect URL. GoogleBot is doing the same thing across a single domain but in different sub-folders. Having checked Webmaster Tools there aren't any hard 404s and the soft ones aren't related and haven't occured since August. I'm really confused as to how this is happening.. Thanks!
Intermediate & Advanced SEO | | panini0