Disallowed "Search" results with robots.txt and Sessions dropped
-
Hi
I've started working on our website and I've found millions of "Search" URL's which I don't think should be getting crawled & indexed (e.g. .../search/?q=brown&prefn1=brand&prefv1=C.P. COMPANY|AERIN|NIKE|Vintage Playing Cards|BIALETTI|EMMA PAKE|QUILTS OF DENMARK|JOHN ATKINSON|STANCE|ISABEL MARANT ÉTOILE|AMIRI|CLOON KEEN|SAMSONITE|MCQ|DANSE LENTE|GAYNOR|EZCARAY|ARGOSY|BIANCA|CRAFTHOUSE|ETON).I tried to disallow them on the Robots.txt file, but our Sessions dropped about 10% and our Average Position on Search Console dropped 4-5 positions over 1 week. Looks like over 50 Million URL's have been blocked, and all of them look like all of them are like the example above and aren't getting any traffic to the site.
I've allowed them again, and we're starting to recover. We've been fixing problems with getting the site crawled properly (Sitemaps weren't added correctly, products blocked from spiders on Categories pages, canonical pages being blocked from Crawlers in robots.txt) and I'm thinking Google were doing us a favour and using these pages to crawl the product pages as it was the best/only way of accessing them.
Should I be blocking these "Search" URL's, or is there a better way about going about it??? I can't see any value from these pages except Google using them to crawl the site.
-
If you have a site with, at least 30k URLs, looking at only 300 keywords won't reflect the general status of the whole site. If you are looking for a 10% loss in traffic, I'd start by chasing the pages that lost more traffic, then analyzing whether they lost rankings or if there are some other issues.
Another way to find where there is traffic loss is in search Console, looking at keywords that aren't in the top300. There might be a lot to analyze.
It's not a big deal having a lot of pages blocked in robots.txt when what's blocked is correctly blocked. Keep in mind that GSC will flag those pages with warnings as they were previously indexed and now are blocked. That's just how they've set up flags.
Hope it helps.
Best luck.
Gaston -
If you have a general site which happens to have a search facility, blocking search results is quite usual. If your site is all 'about' searching (e.g: Compare The Market, stuff like that) then the value-add of your site is how it helps people to find things. In THAT type of situation, you absolutely do NOT want to block all your search URLs
Also, don't rule out seasonality. Traffic naturally goes up and down, especially at this time of year when everyone is on holiday. How many people spend their holidays buying stuff or doing business stuff online? They're all at the beach - mate!
-
Hi Gaston
"Search/" pages were getting a small amount of traffic, and a tiny bit of revenue, but I definitely don't think they need to be indexed or are important to users. We're down in mainly "Sale" & "Brand" pages, and I've heard the Sale in general across the store isn't going well, but don't think I can go back management with that excuse
I think my sitemaps are sorted now, I've broken them down into 6 x 5,000 URL files, and all the canonical tags seem to be fine and pointing to these URL's. I am a bit concerned that URL's "blocked by robots.txt" shot up from 12M to 73M, although all the URLs Search Console are showing me look like they need to be blocked!
We've also tracking nearly 300 Keywords, and they've actually had good improvements in the same period. Finding it hard to explain it!
-
Hi Frankie,
My guess is that the traffic you were losing was because of its traffic driven by /search pages.
The questions you should be asking are:
- Are those /search pages getting traffic?
- Are them important to users?
- After being disallowed, which pages were losing traffic?
As a general rule, Google doesn't want to crawl nor index internal search pages, unless they have some value to users.
On another matter, the crawlability of your product pages can be easily solved with a sitemap file. If you are worried about the size of it, remember that it can contain up to 50k URLs and you can create several sitemaps and list them in a sitemap index.
More info about that here: Split up your large sitemaps - Google Search Console HelpHope it helps.
Best luck,
Gaston
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Ranking drop for "Mobile" devices category in Google webmaster tools
Hi, Our rank dropped and we noticed it's a major drop in "Mobile" devices category, which is contributing to the overall drop. What exactly drops mobile rankings? We do not have any messages in search console. We have made few redirects and removed footer links. How these affect? Thanks,
Intermediate & Advanced SEO | | vtmoz
Satish0 -
WordPress – parent category "blog" instead of regular "post page"?
In WordPress you normally show you blog posts on: Your home page. Your "posts page" (configurable in the Reading Settings) I want to do neither and have a third option instead: Assign a parent category called "blog" for all posts, and show the latest posts on that category's archive page. For the readers, the experience will be 100% the same as a regular "posts page". The UI, permalinks, and breadcrumbs will be 100% the same. But, I have heard that the "posts page" is important for Google for indexing and understanding your blog. So is is smarter SEO-wise to use a "posts page" instead of a parent category named "blog"? What negative effects might there be, if I have no "posts page" and just use the parent category "blog" instead?
Intermediate & Advanced SEO | | NikolasB0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
Use Canonical or Robots.txt for Map View URL without Backlink Potential
I have a Page X with lots of unique content. This page has a "Map view" option, which displays some of the info from Page X, but a lot is ommitted. Questions: Should I add canonical even though Map View URL does not display a lot of info from Page X or adding to robots.txt or noindex, follow? I don't see any back links coming to Map View URL Should Map View page have unique H1, title tag, meta des?
Intermediate & Advanced SEO | | khi50 -
With or without the "www." ?
Is there any benefit whatsoever to having the www. in the URL?
Intermediate & Advanced SEO | | JordanBrown0 -
Does my website have an Exact Match Domain or a "brand"?
I'd like to get some input from the Moz community about the domain name I use on a travel website I run as a hobby. I got heavily whacked by an update in September 2012 which some have said was because my site is an EMD. Others said it was because I had poor quality backlinks (but in fact I hardly had any). With the benefit of hindsight, I'd love to know what really happened. The website is www.traveltipsthailand.com (now www.asiantraveltips.com) and the "brand" I use is "Travel Tips Thailand.The traffic penalty I incurred was around 80% and despite a LOT of work overhauling the site and trying to build some better quality links, I don't believe it has really recovered much. It ranks for non-competitive, low-traffic key phrases (which means it's not penalised as such), but struggles to rank anywhere meaningful on any phrase likely to drive traffic to the site. At this stage I really just want to know whether to persist with the site (it's heartbreaking, to be honest) or drop it an build something new from scratch. I monitor the site's progress using Moz Pro, so I can see all the search ranking, authority and backlink data. 5254ab15dcaa91-52423790
Intermediate & Advanced SEO | | Gavin.Atkinson0 -
How use Rel="canonical" for our Website
How is the best way to use Rel="canonical" for our website www.ofertasdeemail.com.br, for we can say goodbye for duplicated pages? I appreciate for every help. I also hope to contribute to the SEOmoz community. Sincerely,
Intermediate & Advanced SEO | | ZZNINTERNETMEDIAGROUP
Amador Goncalves0 -
301 redirect or Robots.txt on an interstatial page
Hey guys, I have an affiliate tracking system that works like this : an affiliate puts up a certain code on his site, for example : www.domain.com/track/aff_id This url leads to a page where the hit is counted, analysed and then 302 redirects to my sales page with the affiliates ID in the url : www.mysalespage.com/?=aff_id. However, we've noticed recently that one affiliate seems to be ranking for our own name and the url google indexed was his tracking url (domain.com/track/aff_id). Which is strange because there is absolutely nothing on that page, its just an interstatial page so that our stats tracking software can properly filter hits. To remove the affiliate's url from showing up in the serps, I've come up with 2 solutions : 1 - Change the redirect to a 301 redirect on his track page. 2 - Change our robots.txt page to block all domain.com/track/ pages from being indexed. My question is : if I 301 redirect instead of 302, will I keep the affiliates from outranking me for my own name AND pass on link juice or should I simply block google from crawling the interstatial tracking pages?
Intermediate & Advanced SEO | | CrakJason0