Suggested Screaming Frog configuration to mirror default Googlebot crawl?
-
Hi All,
Does anyone have a suggested Screaming Frog (SF) configuration to mirror a default Googlebot crawl? I want to test my site and see whether it will return 429 "Too Many Requests" to Google.
I have set the User Agent to Googlebot (Smartphone). Are the default SF settings under Menu > Configuration > Speed (Max Threads 5 and Max URL/s 2.0) comparable to Googlebot?
Context:
I had tried NetPeak SEO Spider, which did a nice job and had a cool feature that would pause a crawl if it got too many 429s. Long story short, the B2B site threw 429 errors when there should have been no load, on a holiday weekend at 1:00 AM.
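For anyone who wants to reproduce the test outside a desktop crawler, here is a minimal Python sketch of the same idea: request URLs at a capped rate with a Googlebot (Smartphone) user agent and stop once too many 429s come back. The names `fetch_status` and `crawl`, and the `max_429` threshold, are hypothetical and not part of SF or NetPeak. The user-agent string is assumed to match the one Google publishes, with `W.X.Y.Z` standing in for the Chrome version. It is single-threaded, so it is gentler than SF's default of 5 threads, and it only imitates the user agent, not Googlebot's actual crawl behaviour.

```python
import time
import urllib.error
import urllib.request

# Assumption: Googlebot Smartphone user-agent string as published by Google;
# "W.X.Y.Z" is a placeholder for the current Chrome version.
GOOGLEBOT_SMARTPHONE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def fetch_status(url, user_agent=GOOGLEBOT_SMARTPHONE_UA, timeout=10):
    """Return the HTTP status code for a GET request to url."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses, including 429, are raised as HTTPError.
        return error.code


def crawl(urls, fetch=fetch_status, max_urls_per_second=2.0, max_429=3,
          sleep=time.sleep):
    """Fetch urls one at a time; stop early once max_429 responses are 429.

    Returns (statuses, throttled): a dict of url -> status code for every
    URL requested, and the number of 429 responses seen.
    """
    delay = 1.0 / max_urls_per_second  # pause between requests caps the rate
    statuses = {}
    throttled = 0
    for url in urls:
        status = fetch(url)
        statuses[url] = status
        if status == 429:
            throttled += 1
            if throttled >= max_429:
                break  # too many 429s: stop instead of hammering the server
        sleep(delay)
    return statuses, throttled
```

Calling `crawl(list_of_urls)` with the defaults requests at most 2 URLs per second and stops after the third 429. Passing a different `fetch` function makes it possible to dry-run the logic without touching the live site.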
Related Questions
-
What happens to crawled URLs subsequently blocked by robots.txt?
We have a very large store with 278,146 individual product pages. Since these are all various sizes and packaging quantities of fewer than 200 product categories, my feeling is that Google would be better off making sure our category pages are indexed. I would like to block all product pages via robots.txt until we are sure all category pages are indexed, then unblock them. Our product pages rarely change and have no ratings or product reviews, so there is little reason for a search engine to revisit a product page. The sales team is afraid that blocking a previously indexed product page will result in it being removed from the Google index, and would prefer to submit the categories by hand, 10 per day, via requested crawling. Which is the better practice?
Intermediate & Advanced SEO | | AspenFasteners1 -
SEO suggestions for a directory
Hi all, I am new to SEO. I work for a ratings and review website, like TripAdvisor and LinkedIn. How would one go about setting up an SEO strategy for national directories that have local suggested pages? What would be good practice? For example, TripAdvisor has many different restaurants across the UK. What would they do to improve their SEO? How do they target the correct links? How do they go about building their Moz score? I would really appreciate your thoughts and suggestions. Thanks!
Intermediate & Advanced SEO | | Eric_S
Eric0 -
Which One Would You Suggest in Terms of Internationalization?
Hi Friends, This is my website: http://goo.gl/fYndv. As of now, we have only one domain, and we have content in both English and Arabic. The Arabic content is translated from English, so we use alternate tags to tell Google about that. We mostly receive traffic from Saudi Arabia because we are based there. Now we are planning to target major countries like India, Australia, and so on. We know that sub-folders are generally preferred over sub-domains, for example example.com/in/ over in.example.com. But we are not going to change any content; only the currency changes in those geographic sub-domains or sub-folders. Since I am not going to change the content, will it be fine if I go with a sub-folder like example.com/in? Is there any chance of a Google penalty?
Intermediate & Advanced SEO | | Prabhu.Sundar0 -
Download all GSC crawl errors: Possible today?
Hey guys: I tried to download all the crawl data from Google Search Console using the API and solutions like this one: https://github.com/eyecatchup/php-webmaster-tools-downloads but it seems that it is no longer working (or I did something wrong; I just receive a blank page when running the PHP file after some load time)... I last needed to download more than 1,000 URLs a long time ago, so I haven't tried this method since then. Is there any other solution using the API to grab all the crawl errors, or is this no longer possible? Thanks!
Intermediate & Advanced SEO | | antonioaraya1 -
Would spiders successfully crawl a page with two distinct sets of content?
Hello all, and thank you in advance for the help. I have a coffee company that sells both retail and wholesale products. These are typically the same product, just at different prices. We are planning on having a pop-up that asks users on their first visit to self-identify as retail or wholesale clients. So if someone clicks retail, the cookie will show them retail pricing throughout the site, and vice versa for those who identify themselves as wholesale. I can talk to our programmer to find out how he actually plans on doing this from a technical standpoint if it would be of assistance. My question is, how will a spider crawl this site? I am assuming (probably incorrectly) that whatever the "default" selection is (for example, right now people see retail pricing and then opt into wholesale) will be the information/pricing that they index. So long story short, how would a spider crawl a page that has two sets of distinct pricing information displayed based on user self-identification? Thanks again!
Intermediate & Advanced SEO | | ClayPotCreative0 -
Help with domain configuration in international markets
We have just taken on a new client whose core business is based in the UK. They are currently developing a market in Australia and are hoping to develop further markets over the next few years. The website is currently hosted in the UK under a .co.uk domain. We are redeveloping the website and are wondering if we should move to the .com, as the company is now catering to an international market. I was also wondering what the best way is to establish a presence in Australia using the main website, rather than creating a new one. Thanks, Fraser
Intermediate & Advanced SEO | | fraserhannah0 -
Why specify robots instead of googlebot for a Panda affected site?
Daniweb is the poster child for sites that have recovered from Panda. I know one strategy she mentioned was de-indexing all of her tagged content, for example: http://www.daniweb.com/tags/database Why do you think more Panda-affected sites aren't specifying 'googlebot' rather than 'robots' to capture traffic from Bing & Yahoo?
Intermediate & Advanced SEO | | nicole.healthline0 -
How to find what Googlebot actually sees on a page?
1. When I disable JavaScript in Firefox and load our home page, it is missing the entire middle section. 2. Also, the global nav dropdown menu does not display at all (with JavaScript disabled). I believe this is not good. 3. But when I type in <website name> in Google search, click on the cached version of the home page, and then click on the text-only version, it displays the global nav links fine. 4. When I switch the user agent to Googlebot (using the Firefox plugin "User Agent Switcher"), the home page and global nav display fine. Should I be worried about #1 and #2 then? How to find what Googlebot actually sees on a page? (I have tried "Fetch as Googlebot" from GWT. It displays source code.) Thanks for the help! Supriya.
Intermediate & Advanced SEO | | Amjath0