On-site Search - Revisited (again, *zZz*)
-
Howdy Moz fans!
Okay, so there's a mountain of information out there on the webernet about internal search results... but I'm finding some contradiction and a lot of pre-2014 stuff. I'd like to hear some 2016 opinion, specifically around a couple of thoughts of my own, as well as some I've deduced from other sources. For clarity, I work on a large retail site with over 4 million products (product pages), and my predicament is this: I want Google to be able to find and rank my product pages. Yes, I can link to a number of the best ones by creating well-planned links via categorisation, silos, efficient menus etc. (done), but can I utilise site search for this purpose?
-
It was my understanding that Google's bots don't/can't/won't use a search function... how could they? It's like expecting them to find your members-only area - they can't log in! How could they find and index the millions of combinations of search results without typing in "XXXXL underpants" and every other search term? Do I really need to robots.txt my search query parameter? How/why/when would Googlebot ever generate that query parameter?
-
Site search is B.A.D - I read this everywhere I go, but is it really? I've read: "It eats up all your crawl budget", "search results have no content and are classed as spam", "results pages have no value".
I want to find a positive SEO outcome to having a search function on my website, not just try to stifle Mr Googlebot. What I'm trying to learn here is what the options are, and what their outcomes are. So far I have -
_Robots.txt_ - Block Google from crawling the search pages at all.
_Noindex_ - Allow the crawl, but don't index the search pages.
_Nofollow_ - I'm not sure this is even a valid idea, but I picked it up somewhere out there.
_Just leave it alone_ - Some of your search results might get ranked and bring traffic in.
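For reference, the first three options boil down to a couple of lines of config/markup each. The `/search` path here is a placeholder - substitute whatever URL pattern your site search actually uses:

```
# Option 1: robots.txt - stop Googlebot crawling search results at all
User-agent: *
Disallow: /search

# Option 2: meta noindex - allow the crawl, keep results pages out of the index
# (goes in the <head> of the search results template)
<meta name="robots" content="noindex, follow">

# Option 3: nofollow - as above, but also stop crawling of links in the results
<meta name="robots" content="noindex, nofollow">
```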
It appears that each and every option has its positives and negatives. It'd be great to hear from this here community about their experiences in this area.
-
-
Hopefully that helps you some - I know we ran into a similar situation with a client. Good luck!
-
Great idea! This has triggered a few other thoughts too... cheers Jordan.
-
I would recommend using Screaming Frog to crawl only product-level pages and export them to a CSV or Excel doc, then copy and paste your XML sitemap into an Excel sheet. From there, clean up the sitemap, sort it by product-level pages, and compare the two side by side to see what is missing.
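If the lists run to millions of URLs, a spreadsheet gets unwieldy; a minimal script can do the same side-by-side comparison. This is just a sketch - the `Address` column name matches a standard Screaming Frog export, but your file names and sitemap layout are assumptions:

```python
# Sketch: compare a Screaming Frog crawl export against an XML sitemap
# to find product URLs Google can crawl but that are missing from the
# sitemap, and vice versa.
import csv
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_from_crawl_csv(path):
    """Read the 'Address' column from a Screaming Frog internal export."""
    with open(path, newline="") as f:
        return {row["Address"] for row in csv.DictReader(f)}

def urls_from_sitemap(path):
    """Collect every <loc> value from an XML sitemap file."""
    tree = ET.parse(path)
    return {loc.text.strip() for loc in tree.getroot().iter(SITEMAP_NS + "loc")}

def compare(crawled, sitemap):
    """Return the URLs unique to each side, sorted for easy review."""
    return {
        "crawled_not_in_sitemap": sorted(crawled - sitemap),
        "sitemap_not_crawled": sorted(sitemap - crawled),
    }
```

Feed `compare()` the two sets and anything in `crawled_not_in_sitemap` is a candidate for adding to the sitemap (or for investigating why it's orphaned).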
The other option would be to go into Google Webmaster Tools (Search Console), look at Google Index -> Index Status, click the Advanced tab, and see what is indexed and what is being blocked by robots.txt.
-
@jordan & @matt,
I had done this - it was my initial go-to idea and implementation - and I completely agree it's a solution.
I guess I was hoping to answer the question "can Google even use site search?", as this would answer whether the parameter even needs excluding in robots.txt (I suspect it somehow does, as there wouldn't be this much noise about it otherwise).
That leaves the current situation: does restricting Google from my internal search results hinder its ability to find and index my product pages? I'd argue it does - since implementing this 6 months ago, the site's index status has gone from 5.5m to 120k.
However, this could even be a good thing, as it lowers the Googlebot activity requirement and should focus crawling on the stronger pages... but the holy grail I'm trying to achieve here is to get all my products indexed so I can get a few hits a month from each. I'm not trying to get the search results indexed.
-
Agree with Jordan - block the search parameter in robots.txt and forget it. It won't bring search traffic in; it shouldn't get crawled, and if it does, it's always a negative.
-
I can't speak for everyone, but generally we like to robots.txt the search pages. Since you're working on a large retail site, you'll want to ensure your other pages get indexed properly, so blocking the search pages with robots.txt should suffice. I would also look for common recurring searches through the site search to possibly build content around as well.
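On that last point, mining the recurring searches is easy to automate. A minimal sketch, assuming you can export your internal search queries (from analytics or server logs) as a simple list of strings:

```python
# Sketch: rank internal site-search queries by frequency, to spot
# searches worth building dedicated category/landing pages around.
from collections import Counter

def top_queries(queries, n=20):
    """Normalise query strings (trim, lowercase) and return the n most common
    with their counts, ignoring empty entries."""
    counts = Counter(q.strip().lower() for q in queries if q.strip())
    return counts.most_common(n)
```

Anything that shows up hundreds of times a month but has no matching category page is a strong candidate for new content.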
I hope that helps some.