Block search engines from URLs created by internal search engine?
-
Hey guys,
I've got a question for you all that I've been pondering for a few days now. I'm currently doing a technical SEO audit for a large-scale directory.
One major issue they're having is that their internal search system (Directory Search) creates a new URL every time a user enters a search query. This creates huge amounts of duplication on the website.
I'm wondering if it would be best to block search engines from crawling these URLs entirely with robots.txt?
What do you guys think, bearing in mind there are probably thousands of these pages already in the Google index?
Thanks
Kim
-
That sounds perfect - if the user-generated URLs are getting enough traffic, make them permanent pages and 301-redirect or canonicalize the duplicates to them. If not, weed them out of the index.
-
Thanks for your reply, Dr. Meyers. I think you're probably right.
Yes, I'm recommending they define a canonical set of pages covering the most popular searches, categories and locations, all reachable via internal links, and we'll get the duplicates redirected back to that canonical set.
For pages that fall outside those categories and locations, I'll recommend a meta noindex tag.
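The split described above (canonical tag for the popular set, noindex for everything else) could be sketched roughly like this. It's only an illustration: the `CANONICAL_URLS` lookup, the `head_tag` helper, and the choice of which URL counts as the preferred one are all assumptions, not anything taken from the actual site.

```python
from html import escape

# Hypothetical canonical set: popular (category, location) pairs mapped to
# the one URL we want indexed. Which variant is "preferred" is an assumption.
CANONICAL_URLS = {
    ("car dealer", "auckland region"):
        "http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region",
}

def head_tag(category: str, location: str) -> str:
    """Return the <head> tag a search-result page should carry."""
    # Compare case-insensitively so "Car dealer" and "Car Dealer" match.
    key = (category.strip().lower(), location.strip().lower())
    canonical = CANONICAL_URLS.get(key)
    if canonical is not None:
        # Popular search: point every variant at the preferred URL.
        return f'<link rel="canonical" href="{escape(canonical)}">'
    # Anything outside the canonical set is kept out of the index.
    return '<meta name="robots" content="noindex">'

print(head_tag("Car dealer", "Auckland Region"))
print(head_tag("Dog walker", "Timaru"))
```

A 301 redirect to the preferred URL would be the alternative to the canonical tag for the first branch; the lookup logic stays the same either way.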
-
It can be a complicated question on a very large site, but in most cases I'd META NOINDEX those pages. Robots.txt isn't great at removing content that's already been indexed. Admittedly, NOINDEX will take a while to work (virtually any solution will), as Google probably doesn't crawl these pages very often.
Generally, though, the risk of having your index explode with custom search pages is too high for a site like yours (especially post-Panda). I do think blocking those pages somehow is a good bet.
The only exception I would add is if some of the more popular custom searches are getting traffic and/or links. I assume you have a solid internal link structure and other paths to these listings, but if it looks like a few searches (or a few dozen) have attracted traffic and backlinks, you'll want to preserve those somehow.
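One reason robots.txt and META NOINDEX don't combine well: a crawler that is blocked from fetching a page never gets to read the noindex tag on it. Here's a minimal sketch of that check using Python's standard `urllib.robotparser`. The robots.txt rule, the `/search/` path, and the example.com URLs are hypothetical, not the directory's real setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks an internal search path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""

def crawler_can_read_page(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """True if a compliant crawler may fetch the URL (and so see its meta tags)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

# Blocked: the page is never fetched, so a noindex tag on it goes unseen.
print(crawler_can_read_page("http://example.com/search/plumber"))
# Not blocked: the crawler can fetch the page and honor a noindex tag.
print(crawler_can_read_page("http://example.com/listing/123"))
```

So if the goal is to get already-indexed search pages dropped, they'd need to stay crawlable at least until the noindex has been picked up.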
-
Sure, see below for some of the duplication I mean:
Capitalization Duplication
http://yellow.co.nz/yellow+pages/Car+dealer/Auckland+Region
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region
With a few URL parameters
And with location duplication
http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland
Let me know if you need any more info!
Cheers
Kim
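For anyone wanting to size up this kind of duplication in a crawl export, here's a rough sketch that groups URLs differing only by capitalization, query parameters, or a trailing slash. The `duplicate_key` and `group_duplicates` helpers are made up for illustration, and the `?page=2` parameter is a hypothetical stand-in since no real parameter example was given. The lowercase key is only used for comparison, not as a suggested URL format. Location aliases ("Auckland" vs "Auckland Region") would need a separate lookup table and aren't handled here.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def duplicate_key(url: str) -> str:
    """Comparison key: lowercase host and path, no query string or trailing slash."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.lower().rstrip("/")

def group_duplicates(urls):
    """Map each comparison key to the URLs sharing it, keeping only real duplicates."""
    groups = defaultdict(list)
    for url in urls:
        groups[duplicate_key(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}

urls = [
    "http://yellow.co.nz/yellow+pages/Car+dealer/Auckland+Region",
    "http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region",
    "http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland+Region?page=2",  # hypothetical parameter
    "http://yellow.co.nz/yellow+pages/Car+Dealer/Auckland",  # location variant, not merged
]

for key, found in group_duplicates(urls).items():
    print(key, len(found))
```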
-
What does the content look like on the new URL? Can you give us an example?