Stop Google crawling a site at set times
-
Hi All
I know I can use robots.txt to block Google from pages on my site but is there a way to stop Google crawling my site at set times of the day? Or to request that they crawl at other times?
Thanks
Sean
-
Thanks Keri, this gives me a place to start
-
Here's a link to Bing's help page on the tool. http://www.bing.com/webmaster/help/crawl-control-55a30302.
-
Thanks for that Keri, do you know what Bing calls that option , it might give me a starting point for a Google search
Sean
-
I know Bing has a way to do that, but I haven't heard of Google releasing a similar tool.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Subdirectory site / 301 Redirects / Google Search Console
Hi There, I'm a web developer working on an existing WordPress site (Site #1) that has 900 blog posts accessible from this URL structure: www.site-1.com/title-of-the-post We've built a new website for their content (Site #2) and programmatically moved all blog posts to the second website. Here is the URL structure: www.site-1.com/site-2/title-of-the-post Site #1 will remain as a normal company site without a blog, and Site #2 will act as an online content membership platform. The original 900 posts have great link juice that we, of course, would like to maintain. We've already set up 301 redirects that take care of this process. (ie. the original post gets redirected to the same URL slug with '/site-2/' added. My questions: Do you have a recommendation about how to best handle this second website in Google Search Console? Do we submit this second website as an additional property in GSC? (which shares the same top-level-domain as the original) Currently, the sitemap.xml submitted to Google Search Console has all 900 blog posts with the old URLs. Is there any benefit / drawback to submitting another sitemap.xml from the new website which has all the same blog posts at the new URL. Your guidance is greatly appreciated. Thank you.
Intermediate & Advanced SEO | | HimalayanInstitute0 -
Google Is Indexing my 301 Redirects to Other sites
Long story but now i have a few links from my site 301 redirecting to youtube videos or eCommerce stores. They carry a considerable amount of traffic that i benefit from so i can't take them down, and that traffic is people from other websites, so basically i have backlinks from places that i don't own, to my redirect urls (Ex. http://example.com/redirect) My problem is that google is indexing them and doesn't let them go, i have tried blocking that url from robots.txt but google is still indexing it uncrawled, i have also tried allowing google to crawl it and adding noindex from robots.txt, i have tried removing it from GWT but it pops back again after a few days. Any ideas? Thanks!
Intermediate & Advanced SEO | | cuarto7150 -
Google Docs
Hi Mozers, I was wondering what do you guys think about indexing Google Docs files as Documents or Spreadsheets? Can you do that and is it any help if you what to get some content on the firs page of Google. And also can Google see that content and links, because when I deactivate the javascript on chrome I couldn't see anything from the content Thanks
Intermediate & Advanced SEO | | VeeamSoftware0 -
Proper 301 in Place but Old Site Still Indexed In Google
So i have stumbled across an interesting issue with a new SEO client. They just recently launched a new website and implemented a proper 301 redirect strategy at the page level for the new website domain. What is interesting is that the new website is now indexed in Google BUT the old website domain is also still indexed in Google? I even checked the Google Cached date and it shows the new website with a cache date of today. The redirect strategy has been in place for about 30 days. Any thoughts or suggestions on how to get the old domain un-indexed in Google and get all authority passed to the new website?
Intermediate & Advanced SEO | | kchandler0 -
Will Google View Using Google Translate As Duplicate?
If I have a page in English, which exist on 100 other websites, we have a case where my website has duplicate content. What if I use Google Translate to translate the page from English to Japanese, as the only website doing this translation will my page get credit for producing original content? Or, will Google view my page as duplicate content, because Google can tell it is translated from an original English page, which runs on 100+ different websites, since Google Translate is Google's own software?
Intermediate & Advanced SEO | | khi50 -
How is Google crawling and indexing this directory listing?
We have three Directory Listing pages that are being indexed by Google: http://www.ccisolutions.com/StoreFront/jsp/ http://www.ccisolutions.com/StoreFront/jsp/html/ http://www.ccisolutions.com/StoreFront/jsp/pdf/ How and why is Googlebot crawling and indexing these pages? Nothing else links to them (although the /jsp.html/ and /jsp/pdf/ both link back to /jsp/). They aren't disallowed in our robots.txt file and I understand that this could be why. If we add them to our robots.txt file and disallow, will this prevent Googlebot from crawling and indexing those Directory Listing pages without prohibiting them from crawling and indexing the content that resides there which is used to populate pages on our site? Having these pages indexed in Google is causing a myriad of issues, not the least of which is duplicate content. For example, this file <tt>CCI-SALES-STAFF.HTML</tt> (which appears on this Directory Listing referenced above - http://www.ccisolutions.com/StoreFront/jsp/html/) clicks through to this Web page: http://www.ccisolutions.com/StoreFront/jsp/html/CCI-SALES-STAFF.HTML This page is indexed in Google and we don't want it to be. But so is the actual page where we intended the content contained in that file to display: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff As you can see, this results in duplicate content problems. Is there a way to disallow Googlebot from crawling that Directory Listing page, and, provided that we have this URL in our sitemap: http://www.ccisolutions.com/StoreFront/category/meet-our-sales-staff, solve the duplicate content issue as a result? For example: Disallow: /StoreFront/jsp/ Disallow: /StoreFront/jsp/html/ Disallow: /StoreFront/jsp/pdf/ Can we do this without risking blocking Googlebot from content we do want crawled and indexed? Many thanks in advance for any and all help on this one!
Intermediate & Advanced SEO | | danatanseo0 -
What is the best tool to crawl a site with millions of pages?
I want to crawl a site that has so many pages that Xenu and Screaming Frog keep crashing at some point after 200,000 pages. What tools will allow me to crawl a site with millions of pages without crashing?
Intermediate & Advanced SEO | | iCrossing_UK0 -
What is the best practice when a client is setting up multiple sites/domains
I have a client that is creating separate websites to be used for different purposes. What is the best practice here with regards to not looking spammy. i.e. do the domains need to registered with different companies? hosted on different servers, etc? Thanks in advance for your response.
Intermediate & Advanced SEO | | Dan-1718030