Does Google crawl the pages which are generated via the site's search box queries?
-
For example, if I search for an item 'x' in a site's search box and the site displays a list of results for that query, would that results page be crawled? I am asking because that URL does not exist anywhere on the site, so I am unsure whether Googlebot would be able to find it.
-
Google does crawl these pages. Google sometimes even submits a site's search box with a random word to see what happens.
Have a look at this URL: https://www.google.com/search?q=site%3Agoogle.com%20inurl%3A%22search%3Fq%22 You'll see that search-query URLs have been indexed. This happens when people link to them. It doesn't matter that the URL is "non existent". In practice it does exist, because it returns a 200 OK status rather than a 404 or any other error, so search engines treat it as a normal page. Google will probably not index a page it "makes" itself by filling in a random search term, but it will index such a page when it is linked to.
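To make the example URL above easier to read, here is a small Python sketch that decodes its query string. The function name `decode_query` is only an illustrative helper, not something from Google or Moz; it uses nothing beyond the standard library.

```python
from urllib.parse import parse_qs, urlsplit

# The example URL quoted in the answer above.
EXAMPLE_URL = (
    "https://www.google.com/search"
    "?q=site%3Agoogle.com%20inurl%3A%22search%3Fq%22"
)


def decode_query(url: str, param: str = "q") -> str:
    """Return the percent-decoded value of one query parameter (hypothetical helper)."""
    query = urlsplit(url).query      # the part after the "?"
    values = parse_qs(query)         # decodes %3A, %20, %22, %3F, ...
    return values[param][0]


if __name__ == "__main__":
    print(decode_query(EXAMPLE_URL))
```

Decoded, the query asks Google for pages on google.com whose URL contains `search?q`, that is, search-result URLs that ended up in the index.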
-
Google could crawl the dynamic URLs created by your search box, but it usually doesn't unless there is a link to such a dynamic URL somewhere. Internal searches don't cause many problems anymore, but if you want to be sure, you can always block your dynamic search results pages via robots.txt or Google Webmaster Tools (Site configuration > URL parameters).
So if the URL generated by internal searches is http://www.site.com/search/?searchword=search+query+here, you could add this to robots.txt:
User-agent: *
Disallow: /search/
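If you want to check that a rule like this covers your search URLs before deploying it, a minimal sketch with Python's standard `urllib.robotparser` could look like the following. The helper name `is_blocked` and the `/products/` URL are assumptions made for illustration; the parser follows the basic robots.txt prefix-matching rules and may not mirror every detail of how Googlebot interprets the file.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if robots_txt disallows user_agent from fetching url (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    search_url = "http://www.site.com/search/?searchword=search+query+here"
    other_url = "http://www.site.com/products/"  # assumed non-search page
    print(search_url, "blocked:", is_blocked(ROBOTS_TXT, search_url))
    print(other_url, "blocked:", is_blocked(ROBOTS_TXT, other_url))
```

Keep in mind that a Disallow rule only governs crawling: it asks crawlers not to fetch the matching URLs.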
-
No, I am not talking about the Google search box embedded in sites, but about the site's own search box. To answer your second question, I meant that the URL can't be found via the site navigation because it is a dynamically generated URL. I look forward to your response.
-
"if I search for an 'x' item in a site's search box and if the site displays a list of results based on the query, would that page be crawled?"
Is it the Google search box for sites you're talking about?
"I am asking this question because this would be a URL that is non existent on the site"
**If it doesn't exist, you wouldn't find it. Or do you mean that the page can't be found in the menu navigation?**