Setting A Custom User Agent in Screaming Frog
-
Hi all,
Probably a dumb question, but I wanted to make sure I get this right.
How do we set a custom user agent in Screaming Frog? I know its in the configuration settings, but what do I have to do to create a custom user agent specifically for a website?
Thanks much!
- Malika
-
Setting a custom user agent determines things like HTTP/2 so there can be a big difference if you change it to something that might not take advantage of something like HTTP/2
Apparently, it is coming to Pingdom very soon just like it is to Googlebot
http://royal.pingdom.com/2015/06/11/http2-new-protocol/
This Is an excellent example of a user agent's ability to modify the way your site is crawled as well as how efficient it is.
https://www.keycdn.com/blog/https-performance-overhead/
It is important to note that we didn’t use Pingdom in any of our tests because they use Chrome 39, which doesn’t support the new HTTP/2 protocol. HTTP/2 in Chrome isn’t supported until Chrome 43. You can tell this by looking at the
User-Agent
in the request headers of your test results.Pingdom user-agent
Note: WebPageTest uses Chrome 47 which does support HTTP/2.
Hope that clears things up,
Tom
-
Hi Malika,
Think about screaming frog and what it has to detect in order to do that correctly it needs the correct user agent syntax for it will not be able to make a crawl that would satisfy people.
Using a proper syntax for a user agent is essential and I have tried to be non-technical in this explanation I hope it works.
the reason screaming frog needs the user agent because the user-agent was added to HTTP to help web application developers deliver a better user experience. By respecting the syntax and semantics of the header, we make it easier and faster for header parsers to extract useful information from the headers that we can then act on.
Browser vendors are motivated to make web sites work no matter what specification violations are made. When the developers building web applications don’t care about following the rules, the browser vendors work to accommodate that. It is only by us application developers developing a healthy respect
When the developers building web applications don’t care about following the rules, the browser vendors work to accommodate that. It is only by us application developers developing a healthy respect
It is only by us application developers developing a healthy respect for the standards of the web, that the browser vendors will be able to start tightening up their codebase knowing that they don’t need to account for non-conformances.
For client libraries that do not enforce the syntax rules, you run the risk of using invalid characters that many server side frameworks will not detect. It is possible that only certain users, in particular, environments would identify the syntax violation. This can lead to difficult to track down bugs.
I hope this is a good explanation I've tried to keep it very to the point.
Respectfully,
Thomas
-
Hi Thomas,
would you have a simpler tutorial for me to understand? I am struggling a bit.
Thanks heaps in advance
-
I think I want something that is dumbed down to my level for me to understand. The above tutorials are great but not being a full time coder, I get lost while reading those.
-
Hi Matt,
I havent had a luck with this one yet.
-
Hi Malika! How'd it go? Did everything work out?
-
happy I could be of help let me know if there's any issue and I will try to be of help with it. All the best
-
Hi Thomas,
That's a lot of useful information there. I will have a go on it and let you know how it went.
Thanks heaps!
-
please let me know if I did not answer the question or you have any other questions
-
this gives you a very clear breakdown of user agents and their set of syntax rules. The following is valid example of user-agent that is full of special characters,
read this please http://www.bizcoder.com/the-much-maligned-user-agent-header
user-agent: foo&bar-product!/1.0a$*+ (a;comment,full=of/delimiters
references but you want to pay attention to the first URL
https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
| Mozilla/5.0 (X11; Linux i686; rv:10.0) Gecko/20100101 Firefox/10.0 |
http://stackoverflow.com/questions/15069533/http-request-header-useragent-variable
-
if you formatted it correctly see below
User-Agent = product *( RWS ( product / comment ) )
and it was received by your headers yes you could fill in the blanks and test it.
https://mobiforge.com/research-analysis/webviews-and-user-agent-strings
http://mobiforge.com/news-comment/standards-and-browser-compatibility
-
No, you Cannot just put anything in there. The site has to recognize it and ask why you are doing this?
I have listed how to build and already built in addition to what your browser will create by using useragentstring.com
Must be formatted correctly and have it work with a header it is not as easy as it sometimes seems but not that hard either.
You can make & use this to make your own from your Mac or PC
http://www.useragentstring.com/
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2747.0 Safari/537.36
how to build a user agent
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference
- https://developer.mozilla.org/en-US/docs/Setting_HTTP_request_headers
- https://msdn.microsoft.com/en-us/library/ms537503(VS.85).aspx
Lists of user agents
https://support.google.com/webmasters/answer/1061943?hl=en
https://msdn.microsoft.com/en-us/library/ms537503(v=vs.85).aspx
-
Hi Thomas,
Thanks for responding, much appreciated!
Does that mean, if I type in something like -
HTTP request user agent -
Crawler access V2
&
Robots user agent
Crawler access V2
This will work too?
-
To crawl using a different user agent, select ‘User Agent’ in the ‘Configuration’ menu, then select a search bot from the drop-down or type in your desired user agent strings.
http://i.imgur.com/qPbmxnk.png
&
Video http://cl.ly/gH7p/Screen Recording 2016-05-25 at 08.27 PM.mov
Or
Also see
http://www.seerinteractive.com/blog/screaming-frog-guide/
https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#user-agent
https://www.screamingfrog.co.uk/seo-spider/user-guide/
https://www.screamingfrog.co.uk/seo-spider/faq/
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can Schema handle two sets of business hours?
I have a client who, due to covid, will have two sets of business hours. Morning hours for business customers, and afternoon hours for general customers. Is it possible to designate this distinction in schema?
Intermediate & Advanced SEO | | bherman0 -
How to manage user profiles in your website?
We have a real estate website in which agents and builders can create their profiles. My question is shall we use h1 or h2 tags in business profile pages or make them according to web 2.0 standards? In case header tags are used, if two agents have the same name and we have used h2 tag for them, then search result page will end up having two same h2's. Can someone please tell me the right way to manage business profiles in a website? Thanks
Intermediate & Advanced SEO | | dailynaukri1 -
SEO Best Practices for Customer Portals
We have a customer portal which is used to display customer's serial numbers, the knowledgebase, support ticketing information and forum. This information is behind a wall as a user must have support in order to view. The question is what are the best practices for SEO with the customer portal? Should we block these sections from bots? Is there a way to take advantage of the number of pages that are within the portal?
Intermediate & Advanced SEO | | ASCI-Marketing0 -
Same URLS different CMS and server set up. Page Authority now 1
We have moved a clients website over to a new CMS and onto a new server. The Domain and URLs on the main pages of the website are exactly the same so we did not do any 301 re directs. The overall Domain Authority of the site and the Page Authority of the Homepage, while having dropped a bit seem OK. However all the other pages now have a Pagerank of 1 I'm not exactly sure what the IT guys have done but there was some re routing on the server level applied. The move happened around the end of December 2014 And yes traffic has dropped significantly Any ideas?
Intermediate & Advanced SEO | | daracreative0 -
Best way to set up anchor text on parked pages?
Our company is no longer offering a series of products, much to the disappointment of our SEO team since we've spent a long time building up the pages and getting them ranked organically. The pages all have decent page rank and in some cases rank #1 for the primary keyword. We have a sister company that we acquired a year ago and they still offer these products on their website. They are a completely separate company with their own website which existed long before we acquired them and we have nothing to do with their website. Our team has proposed that rather than take down the URLs on our site for the products we no longer offer, to put a message saying something like "sorry we don't offer this anymore but you may be interested in this.." and then link to our sister company with anchor text so that they can get some benefit from our SEO efforts if we can't. The question/issue is how should we do that since there will be a lot of pages from the same domain, about 20 pages, all linking to a few pages on a different domain. Should the anchor text be varied unbranded or branded? On the one hand I think if we change up the anchor text used to link to another page many times from a single domain that looks strange and transparent to google. On the other hand unbranded text would be the better descriptor for users since we are deep linking to the product not the homepage of the other site.
Intermediate & Advanced SEO | | edu-SEO0 -
Re-Direct Users But Don't Affect Googlebot
This is a fairly technical question... I have a site which has 4 subdomains, all targeting a specific language. The brand owners don't want German users to see the prices on the French sub domain and are forcing users into a re-direct to the relevant subddomain, based on their IP address. If a user comes from a different country, (ie the US) they are forced on the UK sub domain. The client is insistent on keeping control of who sees what (I know that's a debate in it's own right), but these re-directs we're implementing to make that happen, are really making it difficult to get all the subdomains indexed as I think googlebot is also getting re-directed and is failing to do it's job. Is there are a way of re-directing users, but not Googlebot?
Intermediate & Advanced SEO | | eventurerob0 -
Best way to re-order page elements based on search engine users
Both versions of the page has essentially same content, but in different order. One is for users coming from Google (and google bot) and other is for everybody else. Questions: Is it cloaking? what will be the best way to re-order elements on the page: totally different style sheets for each version, or calling in different divs in a same style sheet? Is there any better way to re-order elements based on search engine? Let me make it clear again: the content is same for everyone, just in different order for visitors coming from Google and everybody else. Don't ask me the reason behind it (executive orders!!)
Intermediate & Advanced SEO | | StickyRiceSEO0 -
Title tag showing in Google that we are not setting
Hello, We've noticed that when we do a specific search (print screen attached), that the business name and/or a completely different title is getting indexed into the search engine that we are not setting. Below is an example from the source code of how we're setting the title, this matches the 2nd listing circled in the attached image. The indexed title tag reflects "Animal Business Card Holders - Kyle Design" Any ideas or feedback on how this is happening? <title>Animal Business Card Cases in Pet, Insect and Wildlife Designstitle> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <meta name="description" content="Eye-catching business card holder cases personalized with custom animal designs for humane professionals and pet owners. Custom select a sleek metal finish, bold aluminum or iridescent accent color, size and unique design for the ultimate self-expressing animal gift!" /> <meta name="keywords" content="business card holder unique personalized custom holders silver gold wood metal cards cases sleek aluminum engraved contemporary case animal animals design designs black color accents iridescent pet insect wildlife cat dog dragonfly butterfly lions sea turtles sea otters elephants animal lover animal activist zoologist veterinarian breeder animal whisperer thin deep large credit Asian size engraving personalize gift gifts special monogram customized corporate logo name professional title meaningful sentiment" /> <meta name="copyright" content="Copyright Kyle Design" /> <meta name="author" content="Kyle Design" />
Intermediate & Advanced SEO | | marketing_zoovy.com
<meta name="generator" content="xyz Commerce System http://www.domain.com/" />
<link rel="canonical" href="xyz link"
<script type="text/javaScript"> Thanks,
Jamie0