Block search engines from URLs created by internal search engine?
-
Hey guys,
I've got a question for you all that I've been pondering for a few days now. I'm currently doing an SEO Technical Audit for a large scale directory.
One major issue that they are having is that their internal search system (Directory Search) will create a new URL everytime a search query is entered by the user. This creates huge amounts of duplication on the website.
I'm wondering if it would be best to block search engines from crawling these URLs entirely with Robots.txt?
What do you guys think? Bearing in mind there are probably thousands of these pages already in the Google index?
Thanks
Kim
-
That sounds perfect - if the user-generated URLs are getting enough traffic, make them permanent pages and 301-redirect or canonical. If not, weed them out of the index.
-
Thanks for your reply Dr. Meyers. I think you're probably right.
Yes I'm recommending they define a canonical set of pages that are the most popular searches, categories and locations which can be reached via internal links and we'll get all those duplicates re-directed back to that canonical set.
But for pages that fall outside those categories and locations, I'll recommend a meta-no-index tag.
-
It can be a complicated question on a very large site, but in most cases I'd META NOINDEX those pages. Robots.txt isn't great at removing content that's already been indexed. Admittedly, NOINDEX will take a while to work (virtually any solution will), as Google probably doesn't crawl these pages very often.
Generally, though, the risk of having your index explode with custom search pages is too high for a site like yours (especially post-Panda). I do think blocking those pages somehow is a good bet.
The only exception I would add is if some of the more popular custom searches are getting traffic and/or links. I assume you have a solid internal link structure and other paths to these listings, but if it looks like a few searches (or a few dozen) have attracted traffic and back-links, you'll want to preserve those somehow.
-
Sure, check below and some of the duplication I mean:
Capitalization Duplication
http://yellow.co.nz/yellow+pages/Car+dealer/Auckland+Region
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region
With a few URL parameters
And with location duplication
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland
Let me know if you need any more info!
Cheers
Kim
-
Whats the content look like on the new url? Can you give us an example?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt blocked internal resources Wordpress
Hi all, We've recently migrated a Wordpress website from staging to live, but the robots.txt was deleted. I've created the following new one: User-agent: *
Intermediate & Advanced SEO | | Mat_C
Allow: /
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Allow: /wp-admin/admin-ajax.php However, in the site audit on SemRush, I now get the mention that a lot of pages have issues with blocked internal resources in robots.txt file. These blocked internal resources are all cached and minified css elements: links, images and scripts. Does this mean that Google won't crawl some parts of these pages with blocked resources correctly and thus won't be able to follow these links and index the images? In other words, is this any cause for concern regarding SEO? Of course I can change the robots.txt again, but will urls like https://example.com/wp-content/cache/minify/df983.js end up in the index? Thanks for your thoughts!2 -
Competing URLs
Hi We have a number of blogs that compete with our homepage for some keywords/phrases. The URLs of the blogs contain the keywords/phrases. I would like to re-work the blogs so that they target different keywords that don't compete and are more relevant. Should I change the URLs as I think this is what is mainly causing the issue? If so, should I 301 old URL's to the homepage? For example, say we we're a site that specialised in selling plastic cups. Currently there is a blog with the URL www.mysite.com/plastic-cups that outranks the homepage for _plastic cups. _The blog isn't particularly relevant to plastic cups and the homepage should rank for this term. How should I let Google know that it is the homepage that is most relevant for this term? Thanks
Intermediate & Advanced SEO | | Buffalo_71 -
Best SEO url woocommerce, what to do?
Hi! Today we have our product categories indexed (by misstake) and for one of our desired keywords, a category have the nr 1 rank. By misstake, we didnt set nofollow, noindex on our categories, just tags, archives etc. We are now migrating to from Ithemes Exchange to Woocommerce and ime looking on improving our SEO urls for the categories. For keyword "Key1" we rank with this url: http://site/product-category/Key1. The seo meta title and description where untouched when we launched the site last spring so it doesnt look so good.. The plan is to stripe out product-category and instead ad some description ( i have a newly written text of 95 words, 519 letters without space with they keyword precent 5 times in a natural way ) to that particular category and have the url as following: http://site/key1 and then have a 301 redirect for the old http://site/product-category/Key1. What do you think of this? What shall i consider? on the right track? Grateful for any help! // Jonas
Intermediate & Advanced SEO | | knubbz0 -
Company Blog at a different URL
Ok, I have been doing a lot of work over the past 6 months, disavowing low quality links from spammy directories to our company website, etc. However, my efforts seem to have had a negative, not positive effect. This has brought me back to reconsidering what we are doing as we have lost a good amount of traction on the nationwide Google rankings specifically. Considering our company blog - platinumcctv(dot)net - we have used this blog for a long time to inform customers of new products, software developments and then to provide them links to purchase those components. Last week, I revamped the nearly default wordpress theme to another on a piece of advice. However, someone told me that all of our links should be nofollow, even though it is a company blog because we have many links coming from this domain, and it could be found as spammy. Potato/Potato - But before I start the tedious task of changing every link to no follow on a whim, i searched a lot, but have found no CLEAR substantiation of this. Any ideas? Other recommendations appreciated as well! Platinum-CCTV(dot)com
Intermediate & Advanced SEO | | PTCCTV0 -
How long until my correct url is in the serps?
We changed our website including urls. We setup 301 redirects for our pages. Some of the pages show up as the old url and some the new url. When does that change?
Intermediate & Advanced SEO | | EcommerceSite0 -
Internal Javascript Links
Hi, We have a client who has internal links pointing to some relatively new pages that we asked them to implement. The problem is that instead of using standard HTML links, their developers have used javascript - e.g. javascript:GoTo... The new pages have links from the homepage (among others) and have been live for about 3-4 weeks now - yet are still to be indexed by Google, Bing & Yahoo. Is it possibe that Javascript links are making them difficult to be found? Thanks in advance for any tips.
Intermediate & Advanced SEO | | jasarrow0 -
SEO Strategy for URL Change
I'm working with a company who will likely have to change their URL because of a trademark dispute. They will be able to maintain the new URL for some period but will soon need to drop the existing URL all together. Aside from the usual keyword considerations when choosing a URL, are there any SEO strategies I should consider as we execute this change?
Intermediate & Advanced SEO | | Jon_KS0 -
Google Maps results doesn't show my site url but rather the maps url, why is this?
For several of my clients landing pages that show up in the Maps results the website url has been overwritten by the maps url (maps.google.com). Even though on my places page I have the correct website set up. Does anyone have any idea why they would be doing this and how I can correct it? Thanks kinldy in advance, Aaron. maps-url.png
Intermediate & Advanced SEO | | afranklin0