Block all search results (dynamic) in robots.txt?
-
I know that Google does not want to index "search result" pages for a lot of reasons (duplicate content, dynamic URLs, and so on). I recently optimized the entire IA of my sites to use search-friendly URLs, which includes search result pages. So, my search result pages changed from:
- /search?12345&productblue=true&id789
to
- /product/search/blue_widgets/womens/large
As a result, Google started indexing these pages thinking they were static (no opposition from me :)), but I started getting WMT messages saying they are finding a "high number of URLs being indexed" on these sites. Should I just block them altogether, or let it work itself out?
-
You can block the URLs that contain "/product/search/". It can easily be done by adding the following to the robots.txt:

User-agent: *
Disallow: /product/search/

Hope this helps...
-
As you said: The increasing number of pages indexed will dilute the link juice of the entire site.
Can you give more examples, or a tip on where to look for this kind of information?
Thank you.
-
I would agree with BK Search. You want to minimize what Google has to crawl (I know this sounds backwards) so that Google focuses on the pages that you want to rank.
Long term, why would you waste Googlebot's time on pages that don't matter as much? What if you had an update on a more important page while Googlebot was too busy crawling this near-infinite set of search pages?
At this point, I would use the noindex meta tag rather than robots.txt, so that Google will crawl the pages and remove all the URLs from the index. Then you can later add the directory to robots.txt so crawling stops. Otherwise you may end up with a lot of junk in the index.
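To sketch that two-step approach (this is a generic example, not something from the original thread, and the `/product/search/` path is taken from the question above): first add the standard robots meta tag to the `<head>` of each search result page so Google can crawl it and drop it from the index:

```html
<!-- Step 1: let Googlebot crawl the page and see the noindex directive -->
<meta name="robots" content="noindex">
```

Then, once those URLs have fallen out of the index, add the `Disallow: /product/search/` rule to robots.txt (as shown in the earlier answer) to stop the crawl waste going forward. If you block in robots.txt first, Googlebot never sees the noindex tag and the junk URLs can linger in the index.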
-
I might be a little different than some of these answers but I would recommend that you exclude them from getting indexed.
The reasons I would do that are that:
You know it is largely duplicate content that resolves to the same pages as your categories.
Google has stated that they would prefer to not have it indexed.
The increasing number of pages indexed will dilute the link juice of the entire site.
There is also the possibility that people using their browser's URL bar will substantially increase the number of pages indexed.
A competitor could create thousands of links to these pages, leaving you with a huge indexed footprint consisting of nothing but search pages.
And finally, I like having product pages ranking highly if at all possible.
I would do this with both the robots.txt file and the GWMT exclusion for the /product/search/ directory.
Good Luck!
-
Hi! We're going through some of the older unanswered questions and seeing if people still have questions or if they've gone ahead and implemented something and have any lessons to share with us. Can you give an update, or mark your question as answered?
Thanks!
-
As a follow-up with further info: it's been about 5 months since the change. I do get some traffic from these indexed pages (not a ton, but enough that I would prefer not to block them if there is no negative impact). The search engines' reaction seems to be confusion: they index the content, but also recognize that something may not be right. So I am wondering if anyone else has done something similar or is trying this.
Admittedly this is what I wanted the new URL structure to do, as an experiment. Just looking for anyone else who has done or is doing something similar.