Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Setting A Custom User Agent in Screaming Frog
- Hi all, Probably a dumb question, but I wanted to make sure I get this right. How do we set a custom user agent in Screaming Frog? I know it's in the configuration settings, but what do I have to do to create a custom user agent specifically for a website? Thanks much! - Malika
 
- A user agent identifies the client making the request, and the client determines things like HTTP/2 support, so there can be a big difference if you change it to something that might not take advantage of HTTP/2. Apparently HTTP/2 support is coming to Pingdom very soon, just as it is to Googlebot: http://royal.pingdom.com/2015/06/11/http2-new-protocol/ This is an excellent example of how the client behind a user agent can change the way your site is crawled as well as how efficient that crawl is. From https://www.keycdn.com/blog/https-performance-overhead/: "It is important to note that we didn’t use Pingdom in any of our tests because they use Chrome 39, which doesn’t support the new HTTP/2 protocol. HTTP/2 in Chrome isn’t supported until Chrome 43. You can tell this by looking at the User-Agent in the request headers of your test results. Note: WebPageTest uses Chrome 47 which does support HTTP/2." Hope that clears things up, Tom
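To make the version check described above concrete, here is a minimal Python sketch that pulls the Chrome major version out of a User-Agent string and compares it with the Chrome 43 cutoff mentioned above. The function names and both sample strings are made up for illustration; they were not captured from Pingdom or WebPageTest. It only reads the string and does not test which protocol a connection actually used.

```python
import re

# Cutoff taken from the post above: HTTP/2 is supported from Chrome 43.
HTTP2_MIN_CHROME_MAJOR = 43


def chrome_major_version(user_agent):
    """Return the Chrome major version found in a User-Agent string, or None."""
    match = re.search(r"\bChrome/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


def chrome_ua_suggests_http2(user_agent):
    """Hypothetical helper: True if the string advertises Chrome 43 or later."""
    major = chrome_major_version(user_agent)
    return major is not None and major >= HTTP2_MIN_CHROME_MAJOR


# Illustrative strings only (assumed, not taken from real test results).
old_ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36")
new_ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/47.0.2526.80 Safari/537.36")

print(chrome_major_version(old_ua), chrome_ua_suggests_http2(old_ua))
print(chrome_major_version(new_ua), chrome_ua_suggests_http2(new_ua))
```

Keep in mind that anyone can type any string into a user agent field, so a check like this tells you what a client claims to be, not what it can actually do.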
- Hi Malika, Think about what Screaming Frog has to do in order to crawl correctly: it needs a user agent with the correct syntax, or it will not be able to run a crawl that would satisfy people. Using proper syntax for a user agent is essential. I have tried to be non-technical in this explanation, and I hope it works. Screaming Frog needs a user agent because the User-Agent header was added to HTTP to help web application developers deliver a better user experience. By respecting the syntax and semantics of the header, we make it easier and faster for header parsers to extract useful information from the headers that we can then act on. Browser vendors are motivated to make web sites work no matter what specification violations are made. When the developers building web applications don’t care about following the rules, the browser vendors work to accommodate that. It is only by us application developers developing a healthy respect for the standards of the web that the browser vendors will be able to start tightening up their codebase, knowing that they don’t need to account for non-conformances. For client libraries that do not enforce the syntax rules, you run the risk of using invalid characters that many server-side frameworks will not detect. It is possible that only certain users, in particular environments, would identify the syntax violation. This can lead to bugs that are difficult to track down. I hope this is a good explanation; I've tried to keep it very much to the point. Respectfully, Thomas
- Hi Thomas, would you have a simpler tutorial for me to understand? I am struggling a bit. Thanks heaps in advance
- I think I want something that is dumbed down to my level for me to understand. The above tutorials are great, but not being a full-time coder, I get lost while reading them.
- Hi Matt, I haven't had any luck with this one yet.
- Hi Malika! How'd it go? Did everything work out?
- Happy I could be of help. Let me know if there's any issue and I will try to help with it. All the best
- Hi Thomas, That's a lot of useful information there. I will have a go at it and let you know how it goes. Thanks heaps!
- Please let me know if I did not answer the question or if you have any other questions.
- This gives you a very clear breakdown of user agents and their syntax rules; please read it: http://www.bizcoder.com/the-much-maligned-user-agent-header The following is a valid example of a user-agent that is full of special characters:
 user-agent: foo&bar-product!/1.0a$*+ (a;comment,full=of/delimiters)
 References (but you want to pay attention to the first URL): https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference | Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0 | http://stackoverflow.com/questions/15069533/http-request-header-useragent-variable
- If you formatted it correctly (see below) and it was received in your request headers, then yes, you could fill in the blanks and test it.
 User-Agent = product *( RWS ( product / comment ) )
 https://mobiforge.com/research-analysis/webviews-and-user-agent-strings http://mobiforge.com/news-comment/standards-and-browser-compatibility
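For anyone who wants to test a string against that rule before pasting it into a crawler, here is a rough Python sketch. The grammar line quoted above comes from the HTTP specification (RFC 7231, section 5.5.3), with the allowed token characters defined in RFC 7230. This is a simplified check under stated assumptions: it does not handle nested comments or backslash escapes inside comments, and the function name is made up. The sample strings are taken from or modeled on this thread.

```python
import re

# Token characters allowed by RFC 7230: letters, digits, and some punctuation.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# product = token ["/" product-version]
PRODUCT = rf"{TOKEN}(?:/{TOKEN})?"
# Simplified comment: text in parentheses, with no nested parentheses
# and no backslash escapes (an assumption to keep the sketch short).
COMMENT = r"\([\t \x21-\x27\x2A-\x5B\x5D-\x7E]*\)"
# User-Agent = product *( RWS ( product / comment ) )
USER_AGENT = re.compile(rf"{PRODUCT}(?:[ \t]+(?:{PRODUCT}|{COMMENT}))*")


def is_valid_user_agent(value):
    """Return True if the whole string matches the simplified grammar."""
    return USER_AGENT.fullmatch(value) is not None


samples = [
    "Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0",
    "foo&bar-product!/1.0a$*+ (a;comment,full=of/delimiters)",
    "Crawler access V2",
    "My Bot (unclosed comment",
    "Agent/",
]
for sample in samples:
    print(is_valid_user_agent(sample), sample)
```

Passing this check only means the string is well-formed as a header value. Whether a given site treats it the way you expect is a separate question.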
- No, you cannot just put anything in there. The site has to recognize it, and you should ask yourself why you are doing this. I have listed how to build a user agent and some that are already built, in addition to what your browser reports at useragentstring.com. It must be formatted correctly and work as a header; it is not as easy as it sometimes seems, but not that hard either. You can use this to make your own from your Mac or PC: http://www.useragentstring.com/
 Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2747.0 Safari/537.36
 How to build a user agent: https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference https://developer.mozilla.org/en-US/docs/Setting_HTTP_request_headers https://msdn.microsoft.com/en-us/library/ms537503(VS.85).aspx
 Lists of user agents: https://support.google.com/webmasters/answer/1061943?hl=en https://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx
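To see how a string like the Chrome one above is put together, it can help to split it into its parts: products (a name with an optional version after a slash) and comments (the text in parentheses). The Python sketch below is a loose splitter for illustration only, not a validator, and the function name is hypothetical. It assumes comments are not nested.

```python
import re

# Either a parenthesised comment, or a product name with an optional /version.
PART = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/(\S+))?")


def split_user_agent(value):
    """Split a User-Agent string into ('product' | 'comment', text, version) tuples."""
    parts = []
    for comment, name, version in PART.findall(value):
        if name:
            parts.append(("product", name, version or None))
        else:
            parts.append(("comment", comment, None))
    return parts


chrome_ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) "
             "AppleWebKit/537.36 (KHTML, like Gecko) "
             "Chrome/53.0.2747.0 Safari/537.36")

for part in split_user_agent(chrome_ua):
    print(part)
```

Read this way, a custom user agent is just your own product name and version, optionally followed by a comment, for example a contact URL.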
- Hi Thomas, Thanks for responding, much appreciated! Does that mean that if I type in something like the following, it will work too? HTTP request user agent: Crawler access V2, and Robots user agent: Crawler access V2
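On the robots side of that question, here is a rough Python sketch of how a robots user agent name selects which robots.txt group applies. It uses Python's standard `urllib.robotparser`, whose matching rules are its own and are not necessarily what Screaming Frog does. The robots.txt content, the crawler names, and example.com are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for "MyCrawler", one for everyone else.
ROBOTS_TXT = """\
User-agent: MyCrawler
Disallow: /private/

User-agent: *
Disallow:
"""


def allowed(robots_user_agent, url):
    """Return True if the given robots user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(robots_user_agent, url)


# The named group blocks /private/ for MyCrawler; other names fall back to "*".
print(allowed("MyCrawler", "http://example.com/private/page"))
print(allowed("OtherBot", "http://example.com/private/page"))
```

The point is that the robots user agent decides which rules you are held to, so a name that no site lists in its robots.txt will normally just be treated under the `*` group.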
- To crawl using a different user agent, select ‘User Agent’ in the ‘Configuration’ menu, then select a search bot from the drop-down or type in your desired user agent strings. Screenshot: http://i.imgur.com/qPbmxnk.png Video: http://cl.ly/gH7p/Screen Recording 2016-05-25 at 08.27 PM.mov Also see: http://www.seerinteractive.com/blog/screaming-frog-guide/ https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#user-agent https://www.screamingfrog.co.uk/seo-spider/user-guide/ https://www.screamingfrog.co.uk/seo-spider/faq/
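If you want to confirm what a server actually receives once a custom user agent is set, one option is to send a request yourself and look at the header on the other end. The Python sketch below only illustrates that idea and is not anything Screaming Frog runs: it starts a throwaway local server that echoes back the User-Agent it received, then requests it with a custom value. The handler, the function names, and the string `MyCustomCrawler/1.0` are all hypothetical.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoUserAgent(BaseHTTPRequestHandler):
    """Reply to any GET with the User-Agent header the client sent."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_with_user_agent(url, user_agent):
    """Request the URL with a custom User-Agent and return the response text."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    # Bypass any system proxy so the request reaches the local test server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.read().decode("utf-8")


def send_and_echo(user_agent):
    """Start a local echo server on a free port, send one request, return the echo."""
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgent)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/"
        return fetch_with_user_agent(url, user_agent)
    finally:
        server.shutdown()
        server.server_close()


print(send_and_echo("MyCustomCrawler/1.0"))
```

Against a real site you would check the server's access log instead, where the same string should appear for each request the crawler makes.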