Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Would you rate-control Googlebot? How much crawling is too much crawling?
-
One of our sites is very large - over 500M pages. Google has indexed 1/8th of the site - and they tend to crawl between 800k and 1M pages per day.
A few times a year, Google will significantly increase their crawl rate - overnight hitting 2M pages per day or more. This creates big problems for us, because at 1M pages per day Google is consuming 70% of our API capacity, and the API overall is at 90% capacity. At 2M pages per day, 20% of our page requests are 500 errors.
I've lobbied for an investment / overhaul of the API configuration to allow for more Google bandwidth without compromising user experience. My tech team counters that it's a wasted investment - as Google will crawl to our capacity whatever that capacity is.
Questions to Enterprise SEOs:
* Is there any validity to the tech team's claim? I thought Google's crawl rate was based on a combination of PageRank and the frequency of page updates. That would imply there is some upper limit - which we perhaps haven't reached - at which the crawl rate would stabilize.
* We've asked Google to limit our crawl rate in the past. Is that harmful? I've always looked at a robust crawl rate as a good problem to have.
* Is 1.5M Googlebot API calls a day desirable, or something any reasonable Enterprise SEO would seek to throttle back?
* What about setting a longer refresh rate in the sitemaps? Would that reduce the daily crawl demand? We could increase it to a month, but at 500M pages Google could still have a ball at the 2M pages/day rate.
Thanks
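For anyone weighing the "throttle it ourselves" option, here is a minimal sketch of a server-side budget for Googlebot. It is an illustration only, not what this site runs. The class and function names, the burst size of 50, and the plain User-Agent substring check are all assumptions (a User-Agent can be spoofed, so a real setup would verify the crawler properly). As far as I understand Google's crawler documentation, Googlebot slows down when it receives 500, 503 or 429 responses, but serving them for more than a short period risks pages being dropped from the index, so treat this as a short-term safety valve.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def status_for(user_agent, bucket):
    """Return the HTTP status to serve: 200 normally, 503 when the crawler is over budget."""
    # Assumption: a naive substring match stands in for real crawler verification.
    if "Googlebot" not in user_agent:
        return 200  # regular visitors are never throttled here
    return 200 if bucket.allow() else 503


# Hypothetical budget: 1M pages/day is roughly 11.6 requests per second.
PAGES_PER_DAY = 1_000_000
googlebot_bucket = TokenBucket(rate=PAGES_PER_DAY / 86_400, capacity=50)
```

The point of a bucket rather than a hard per-second cap is that short bursts get through while the daily total stays near the budget, so an overnight jump to 2M pages/day turns into 503s for the crawler instead of 500s for users.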
-
I agree with Matt that there can probably be a reduction of pages, but that aside, how much of an issue this is comes down to which pages aren't being indexed. It's hard to advise without seeing the site; are you able to share the domain? If the site has been around for a long time, that seems a low level of indexation. Is this a site where the age of the content matters, like Craigslist, for example?
Craig
-
Thanks for your response. I get where you're going with that. (Ecomm store gone bad.) It's not actually an Ecomm FWIW. And I do restrict parameters - the list is about a page and a half long. It's a legitimately large site.
You're correct - I don't want Google to crawl the full 500M. But I do want them to crawl 100M. At the current crawl rate we limit them to, it's going to take Google more than 3 months to get to each page a single time. I'd actually like to let them crawl 3M pages a day. Is that an insane amount of Googlebot bandwidth? Does anyone else have a similar situation?
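The "more than 3 months" figure follows directly from the numbers in the thread. A quick sketch of the arithmetic, assuming a steady crawl rate and that every fetch hits a page not yet crawled (both simplifications; the function name is just for illustration):

```python
import math


def days_to_crawl(pages, pages_per_day):
    """Days needed to fetch each page once at a constant daily crawl rate."""
    return math.ceil(pages / pages_per_day)


TARGET_PAGES = 100_000_000  # the 100M pages the poster wants crawled

for rate in (800_000, 1_000_000, 2_000_000, 3_000_000):
    print(f"{rate:>9,} pages/day -> {days_to_crawl(TARGET_PAGES, rate)} days")
```

At the current 800k to 1M pages per day that works out to 100 to 125 days for a single pass over 100M pages, while 3M pages per day would bring it down to about 34 days.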
-
Gosh, that's a HUGE site. Are you having Google crawl parameter pages with that? If so, that's a bigger issue.
I can't imagine the crawl issues with 500M pages. A site:amazon.com search only returns 200M. Ebay.com returns 800M so your site is somewhere in between these two? (I understand both probably have a lot more - but not returning as indexed.)
You always WANT a full site crawl - but your techs do have a point. Unless there's an absolutely necessary reason to have 500M indexed pages, I'd also seek to cut that to what you want indexed. That sounds like a nightmare ecommerce store gone bad.